1.  Introduction

The  theory  of  excess  burden  and  optimal  commodity  taxation  is  one  of  the  oldest 
subjects  of  study  in  public  finance,  dating  back  to  Dupuit  (1844),  and  yet  is  also 
closely  associated  with  the  rapid  analytical  development  of  the  field  which 
commenced  in  the  early  1970s.  Perhaps  more  than  in  most  areas  of  economics, 
there  has  been  a tendency  to  overlook  contributions  made  in  earlier  decades.  As  a 
result,  much  of  the  “new”  public  economics  of  the  last  decade  may  be  viewed,  in 
part,  as  a restatement  and  extension,  perhaps  in  less  arcane  language  and 
terminology,  of  previously  proven  propositions. 

Probably  the  most  celebrated  example  of  such  “rediscovery”  is  that  of  Ramsey’s 
(1927)  derivation  of  optimal  commodity  tax  formulae,  now  referred  to  as  the 
Ramsey  rule.  The  lapse  here  is  even  harder  to  understand  in  that  Ramsey’s  results 
were  succinctly  described  in  Pigou’s  classic  public  finance  text  (1947)  and 
rederived  by  Boiteux  (1956).  The  deadweight  loss  “triangles”  made  popular  by 
the  work  of  Harberger  (1964)  were  considered  by  Hotelling  (1938),  and  appear 
implicitly  in  Dupuit  (1844): 

“It  follows  that  when  the  change  in  consumption  brought  about  by  a tax  is 
known,  it  is  possible  to  find  an  upper  limit  to  the  amount  of  the  utility  lost 
by  multiplying  the  change  in  consumption  by  half  the  tax.”1 

Indeed,  the  generalization  of  such  excess  burden  formulae  by  Boiteux  (1951) 
and  Debreu  (1951,  1954)  has  until  recently2  been  almost  entirely  ignored  in  the 
subsequent literature. Even the “Laffer curve”, popular for a time among non-economists,
might more appropriately be called the “Dupuit curve”:

*I am grateful to Angus Deaton, Avinash Dixit, Liam Ebrill, Jerry Hausman, Mervyn King, Randy
Mariger,  Jack  Mintz,  Harvey  Rosen,  Efraim  Sadka,  Jon  Skinner,  Nick  Stern  and  Lars  Svensson  for 
comments  on  an  earlier  draft. 

1 Dupuit  (1844). 

2 See,  for  example,  Diewert  (1981). 
“If a tax is gradually increased from zero up to a point where it becomes
prohibitive,  its  yield  is  at  first  nil,  then  increases  by  small  stages  until  it 
reaches  a maximum,  after  which  it  gradually  declines  until  it  becomes  zero 
again.  It  follows  that  when  the  state  requires  to  raise  a given  sum  by  means 
of  taxation,  there  are  always  two  rates  of  tax  which  would  fulfill  the 
requirement,  one  above  and  one  below  that  which  would  yield  the  maximum. 
There  may  be  a very  great  difference  between  the  amounts  of  utility  lost 
through  these  taxes  which  yield  the  same  revenue.”3 

The  purpose  of  this  chapter  is  to  present  the  chronological  development  of  the 
concept  of  excess  burden  and  the  related  study  of  optimal  tax  theory.  A main 
objective  is  to  uncover  the  interrelationships  among  various  apparently  distinct 
results,  so  as  to  bring  out  the  basic  structure  of  the  entire  problem. 


1.1.  Outline  of  the  chapter 

Any  discussion  of  welfare  economics  inevitably  begins  with  the  problem  of 
welfare  measurement,  which  in  the  present  context  involves  a treatment  of 
Marshall’s  consumers’  surplus  and  its  relationship  to  Hicks’  (1942)  notions 
of  compensating  and  equivalent  variations.  These  are  discussed  in  Section  2, 
where  special  attention  is  paid  to  the  distinction  between  the  measurement  of  the 
welfare  effects  of  price  changes  and  the  distortionary  impact  of  tax  changes. 
Section  3 develops  the  various  measures  of  excess  burden,  focusing  on  issues  of 
approximation,  informational  requirements  and  aggregation  over  individuals,  and 
the  effects  of  a more  general  technology  than  the  commonly  supposed  one  with 
fixed  producer  prices.  Section  4 reviews  some  of  the  empirical  attempts  to 
estimate  various  deadweight  losses.  Section  5 presents  and  interprets  the  basic 
rules  for  optimal  commodity  taxation,  including  a discussion  of  the  role  of  profits 
taxation  and  the  desirability  of  production  efficiency.  The  analysis  in  Section  6 
concerns  the  relative  desirability  of  direct  and  indirect  taxation  and  the  structure 
of  individual  preferences.  Section  7 presents  some  applications  of  optimal  tax 
theory to questions such as the provision of public goods, correction of externalities,
and the allocation of risk. Finally, in Section 8, we explore the issue of tax
reform,  as  distinct  from  de  novo  tax  design.  This  literature  dates  back  to  Corlett 
and  Hague  (1953-54),  and  asks  whether  specified  local  movements  away  from  an 
initial  suboptimal  equilibrium  will  improve  social  welfare.  In  general,  movement 
of  prices  in  the  direction  of  their  optimal  levels  does  not  guarantee  such  an 
improvement. 


3 Dupuit, op. cit., p. 278. For this particular rediscovery, I am indebted to the historical analysis of
Atkinson and Stern (1980).
2.1.  Consumers’ surplus and the Hicksian variations

We  begin  with  Marshall’s  (1920,  p.  811)  diagram,  in  Figure  2.1,  depicting 
consumers’  and  producers’  surplus.  The  consumers’  surplus  is  defined,  somewhat 
vaguely,  to  be  the  amount  that  consumers  would  pay  in  excess  of  the  amount  they 
are  paying,  p0x0,  for  the  amount  they  are  purchasing,  x0.  Interpreting  the 
demand  curve  as  an  expression  of  willingness  to  pay,  we  obtain  area  A as  such  a 
measure by integrating the vertical gap between the demand curve and p0 over x.
Similarly,  interpreting  producers’  surplus  as  the  level  of  profits  received  in 
supplying  the  quantity  sold,  and  assuming  that  competitive  supply  causes  the 
marginal  social  cost  to  coincide  with  the  supply  schedule  S,  we  obtain  the  area  B. 
The  sum  A + B is  maximized  when  price  equals  marginal  cost,  and  changes  in 
each  measure  following  from  a price  change  are  easily  calculated.  For  example,  if 
the price rises from p0 to p1, the change in consumers’ surplus is the area of a


Figure  2.1.  Consumers’  and  producers’  surplus. 
trapezoid which equals

\Delta S = -\int_{p_0}^{p_1} x(p) \, dp,  (2.1)

where x(·) is the demand function with respect to the good’s own price, holding
other  prices  fixed.4 

The  basic  problem  with  consumers’  surplus  as  a welfare  measure  is  that  it  does 
not  come  directly  from  underlying  consumer  preferences.  As  a result,  it  has  the 
serious  flaw  of  path-dependence:  if  more  than  one  price  changes,  the  order  in 
which the trapezoids in (2.1) are calculated matters. That is, if we let x^i and p^i be
the quantity demanded and price in the ith market, the sum of individual changes
in consumers’ surplus, ΔS^i, i.e., the line integral

\Delta S = \sum_i \Delta S^i = -\int_{p_0}^{p_1} \sum_i x^i \, dp^i,  (2.2)

takes  on  different  values  according  to  the  path  of  integration  from  the  initial  price 
vector p0 to the ultimate price vector p1. To see this, consider a simple example
with  two  markets.  If  we  change  the  price  in  market  1 first,  the  change  in  surplus  is 

\Delta S_1 = -\int_{p_0^1}^{p_1^1} x^1(p^1, p_0^2) \, dp^1 - \int_{p_0^2}^{p_1^2} x^2(p_1^1, p^2) \, dp^2,  (2.3a)

while  if  we  change  the  price  in  market  2 first,  we  obtain 

\Delta S_2 = -\int_{p_0^1}^{p_1^1} x^1(p^1, p_1^2) \, dp^1 - \int_{p_0^2}^{p_1^2} x^2(p_0^1, p^2) \, dp^2.  (2.3b)

Subtracting ΔS_1 from ΔS_2, we obtain

\Delta S_2 - \Delta S_1 = -\int_{p_0^1}^{p_1^1} \left[ x^1(p^1, p_1^2) - x^1(p^1, p_0^2) \right] dp^1

+ \int_{p_0^2}^{p_1^2} \left[ x^2(p_1^1, p^2) - x^2(p_0^1, p^2) \right] dp^2.  (2.4)


4 Note that, by integrating (2.1) by parts, we obtain the formula for ΔS based on the difference
between the two levels of surplus themselves, i.e.,

\Delta S = \int_{x(p_0)}^{x(p_1)} p(x) \, dx - [p_1 x_1 - p_0 x_0] = \left[ \int_0^{x(p_1)} p(x) \, dx - p_1 x_1 \right] - \left[ \int_0^{x(p_0)} p(x) \, dx - p_0 x_0 \right].
For this term to equal zero, it must generally be zero over all subintervals between
p0 and p1. In particular, for small changes in p^1 and p^2, with p_1^1 = p_0^1 + dp^1 and
p_1^2 = p_0^2 + dp^2, (2.4) becomes

\Delta S_2 - \Delta S_1 = \frac{\partial x^2(p_0^1, p_0^2)}{\partial p^1} \, dp^1 \, dp^2 - \frac{\partial x^1(p_0^1, p_0^2)}{\partial p^2} \, dp^1 \, dp^2,  (2.5)


which equals zero only if the cross-price derivatives ∂x^1/∂p^2 and ∂x^2/∂p^1 are
equal.5  Such  symmetry  holds  for  compensated  demands:  the  Slutsky  matrix  is 
symmetric  [Hicks  (1946)].  However,  ordinary  demand  derivatives  also  possess 
income  effects  that  are  not  generally  equal. 
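The path-dependence described by (2.3a), (2.3b) and (2.4) can be checked numerically. The following is a minimal sketch rather than anything taken from the text: it assumes a two-good linear expenditure (Stone-Geary) demand system, and the parameter values, the prices, and the helper names `demand` and `integrate` are all hypothetical choices made for illustration.

```python
# Illustrative Stone-Geary (linear expenditure system) demands.
# All parameter values are assumptions made for this sketch.
B = (0.3, 0.5)   # marginal budget shares of goods 1 and 2
G = (2.0, 1.0)   # subsistence quantities of goods 1 and 2
Y = 100.0        # money income, held fixed (ordinary demands)


def demand(i, p1, p2):
    """Ordinary demand for good i (index 0 or 1) at prices (p1, p2)."""
    supernumerary = Y - p1 * G[0] - p2 * G[1]
    return G[i] + B[i] * supernumerary / (p1, p2)[i]


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule approximation to the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


P0 = (1.0, 1.0)  # initial price vector
P1 = (2.0, 3.0)  # final price vector

# Path of (2.3a): change the price in market 1 first, then market 2.
ds_1 = (-integrate(lambda p: demand(0, p, P0[1]), P0[0], P1[0])
        - integrate(lambda p: demand(1, P1[0], p), P0[1], P1[1]))

# Path of (2.3b): change the price in market 2 first, then market 1.
ds_2 = (-integrate(lambda p: demand(0, p, P1[1]), P0[0], P1[0])
        - integrate(lambda p: demand(1, P0[0], p), P0[1], P1[1]))

print("path 1:", ds_1, "path 2:", ds_2, "difference:", ds_2 - ds_1)
```

Because the cross-price derivatives of these ordinary demands differ, the two paths give different totals; with parameters chosen so that the cross-price derivatives coincide (for instance, zero subsistence quantities), the difference should vanish.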

The  path-dependence  problem  does  not  arise  from  surplus  measures  based  on 
compensated  commodity  demands,  for  which  the  symmetry  property  holds.  Here, 
however,  we  face  a different  question:  since  utility  does  change  with  the  change  in 
prices,  which  utility  level  should  be  used  as  a reference  level  for  the  compensated 
demand  functions?  Two  natural  candidates  are  the  levels  of  utility  prevailing 
before  and  after  the  price  changes.  Following  Hicks  (1942),  we  define  the 
compensating  variation  of  a price  change  to  be  that  amount  of  income  the 
consumer  must  receive  to  leave  utility  unaffected  by  the  price  change,  and 
the  equivalent  variation  as  the  amount  of  income  the  consumer  would  forego  to 
avoid  the  price  change.  By  definition,  the  compensating  variation  of  a price 
change from p0 to p1 equals the equivalent variation of a change from p1 to p0.
Using  the  expenditure  function,  defined  by  the  minimization  of  expenditure  at 
given  prices  to  satisfy  a given  level  of  utility: 


E(p, U) = \min_x (p \cdot x) subject to U(x) \geq U,  (2.6a)

we may express concisely the equivalent and compensating variations as E(p1, U)
− E(p0, U), where U is the pre-change utility level in the case of the compensating
variation, and the post-change utility level in the case of the equivalent
variation. Letting y be the consumer’s actual income,6 we can express these two
measures  as  functions  of  prices  and  income  alone  through  use  of  the  indirect 
utility  function,  V(p,  y),  defined  by 

V(p, y) = \max_x U(x) subject to p \cdot x \leq y.  (2.6b)

Substituting  (2.6b)  into  (2.6a),  we  obtain  for  the  compensating  variation  of  a price 


5See  Hotelling  (1938)  for  the  original  statement  of  this  result. 

6y  should  be  thought  of  as  a comprehensive  “full  income”  measure  not  affected  by  individual 
decisions  regarding,  for  example,  labor  supply.  This  is  discussed  further  in  Section  5 below. 
change from p0 to p1,

CV(p_0, p_1) = E(p_1, V(p_0, y)) - E(p_0, V(p_0, y))

= E(p_1, V(p_0, y)) - y,  (2.7a)

and  for  the  corresponding  equivalent  variation, 

EV(p_0, p_1) = E(p_1, V(p_1, y)) - E(p_0, V(p_1, y))

= y - E(p_0, V(p_1, y)),  (2.7b)

[where we use the identity y = E(p, V(p, y))].

These  measures  may  be  depicted  graphically.  By  the  envelope  theorem,  the 
derivative of the expenditure function with respect to an individual price p^i is


Figure  2.2.  Compensating  and  equivalent  variations. 
simply the Hicksian or compensated demand x_c^i(p, U). Thus, either of the
Hicksian variations may be expressed (for the appropriate value of U) as

\int_{p_0}^{p_1} \frac{\partial E}{\partial p}(p, U) \, dp = \int_{p_0}^{p_1} x_c(p, U) \, dp.  (2.8)

Since  the  cross-price  derivatives  are  symmetric  for  compensated  demands,  these 
measures  are  path-independent.  For  the  case  of  a single  price  change,  they  may  be 
easily compared to the simple change in consumers’ surplus, which is then
well-defined. This is shown in Figure 2.2, where D_c(U) is the compensated
demand curve corresponding to the compensated demands x_c(p, U), drawn
more steeply than the ordinary demand curve D under the assumption of
normality. The ordinary consumers’ surplus changes by the area A + B with an
increase in price from p0 to p1. The compensating variation of the change equals
the area A + B + C, while the equivalent variation equals the area A. The
bracketing  of  the  Marshallian  measure  by  the  two  Hicksian  measures  was 
emphasized  by  Hicks  (1942)  and  Willig  (1976)  in  their  attempts  at  rehabilitation 
of  consumers’  surplus  as  a welfare  measure.  However,  their  argument  becomes 
weaker  when  more  than  one  price  changes,  for  then  consumers’  surplus  is  not 
even  single-valued.  Moreover,  for  estimating  the  excess  burden  of  a tax,  it  is  not 
the  entire  loss  to  the  consumer  in  which  we  are  interested  but  rather  the  loss  in 
excess  of  revenue  collected.  It  turns  out  that  in  such  a case,  the  felicitous  outcome 
with  respect  to  the  relative  sizes  of  the  three  measures  no  longer  holds. 
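The bracketing of the Marshallian measure by the two Hicksian variations in (2.7a) and (2.7b) can be illustrated for a single price increase. The sketch below assumes Cobb-Douglas preferences over two goods with the second price normalised to one; the budget share, income and prices are hypothetical values, and the function names are introduced only for this example.

```python
import math

A = 0.4    # budget share of the good whose price rises (assumed)
Y = 100.0  # consumer's income (assumed)
K = A**A * (1 - A)**(1 - A)  # Cobb-Douglas normalising constant


def indirect_utility(p, y):
    """V(p, y) for U = x1**A * x2**(1 - A) with the other price set to 1."""
    return K * y * p**(-A)


def expenditure(p, u):
    """E(p, U): minimum outlay needed to reach utility u at price p."""
    return u * p**A / K


P_OLD, P_NEW = 1.0, 1.5  # price before and after the change (assumed)

# Compensating variation (2.7a): reference utility is the pre-change level.
cv = expenditure(P_NEW, indirect_utility(P_OLD, Y)) - Y

# Equivalent variation (2.7b): reference utility is the post-change level.
ev = Y - expenditure(P_OLD, indirect_utility(P_NEW, Y))

# Marshallian surplus loss: integral of ordinary demand A*Y/p over the price.
cs_loss = A * Y * math.log(P_NEW / P_OLD)

print("EV:", ev, "surplus loss:", cs_loss, "CV:", cv)
```

Since both goods are normal under these preferences, the equivalent variation falls below the surplus loss, which in turn falls below the compensating variation, matching the ordering of areas A, A + B and A + B + C.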


2.2.  Definitions  of  excess  burden 

The  deadweight  loss  from  a tax  system  is  that  amount  that  is  lost  in  excess  of 
what  the  government  collects.  Unfortunately,  while  this  definition  makes  intuitive 
sense,  it  is  too  vague  to  permit  a single  interpretation. 

We  begin  again  with  the  simple  Marshallian  approach,  which  is  adequate  for 
purposes  of  illustrating  the  concept  of  excess  burden  in  a single  market.  We  can 
see the effects of a tax t in Figure 2.3. By raising the consumer price from p0 to
p1 + t, the tax reduces consumers’ surplus by the area A + B. Producers’ surplus
is reduced by C + D, because of the drop in producer price to p1, but tax revenues
amount only to A + C, yielding a social loss of B + D, or approximately ½t(x0 −
x1) = −½tΔx, as suggested by Dupuit.

A key  aspect  of  this  measure  is  that  it  is  greater  than  zero  whether  the  tax  is 
positive  or  negative.  The  case  of  a subsidy  at  rate  s is  depicted  in  Figure  2.4. 
Here, there is an increase in consumption, and consumers’ surplus and
producers’ surplus rise by the areas H + I and F + G, respectively. But the
Figure 2.3.  Excess burden of a tax.


amount of the subsidy exceeds those gains by the area J, equal to ½sΔx or,
again, −½tΔx for t = −s being the algebraic value of the tax. The loss comes
from  the  distortion  of  a Pareto  optimal  allocation,  not  simply  the  reduction  in 
output. 

For  the  case  where  a tax  already  exists,  we  may  ask  what  additional  excess 
burden  would  be  caused  by  a tax  increase.  In  this  case,  we  subtract  the  change  in 
government revenue from the change in producers’ and consumers’ surplus, since
revenue  is  positive  at  the  initial  point.  The  resulting  measure  is  shown  in  Figure 
2.5. 

By raising the consumer price from p1 + t1 to p2 + t2, the tax causes a loss in
consumers’ surplus of A + B. Producers’ surplus declines by C + D, and, as
before, the government collects additional revenue on the purchases x2 equal to
(t2 − t1) · x2, or area A + C. However, the government loses the revenue it was
collecting on the purchases in excess of x2, equal to area E. Thus, the welfare loss
of the tax increase equals the trapezoidal area B + E + D, or approximately
−(tΔx + ½ΔtΔx). Thus, even if Δt is very small, the additional excess burden
Figure 2.4.  Excess burden of a subsidy.

need  not  be,  unlike  in  the  case  where  no  tax  exists  initially:  there  is  now  a 
first-order  welfare  loss  resulting  from  marginal  tax  changes. 
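The contrast between a tax introduced at an undistorted equilibrium and an increase in a pre-existing tax can be made concrete with a small numerical sketch. It assumes linear demand and supply schedules, for which the triangle and trapezoid formulae are exact; the intercept and slope values and all function names are hypothetical.

```python
import math

# Assumed linear schedules: demand x = A_D - B_D * p at consumer price p,
# supply x = D_S * q at producer price q, with p = q + t under a unit tax t.
A_D, B_D, D_S = 100.0, 10.0, 10.0


def quantity(t):
    """Equilibrium quantity traded when the unit tax is t."""
    q = (A_D - B_D * t) / (B_D + D_S)  # producer price clearing the market
    return D_S * q


def welfare(t):
    """Consumers' surplus + producers' surplus + tax revenue at tax t."""
    x = quantity(t)
    consumers = x * x / (2.0 * B_D)  # triangle under the inverse demand curve
    producers = x * x / (2.0 * D_S)  # triangle above the inverse supply curve
    return consumers + producers + t * x


def excess_burden(t):
    """Loss of total surplus relative to the untaxed equilibrium."""
    return welfare(0.0) - welfare(t)


T1, DT = 1.0, 0.2          # pre-existing tax and the increase (assumed)
dx = quantity(T1 + DT) - quantity(T1)

added = excess_burden(T1 + DT) - excess_burden(T1)
trapezoid = -(T1 * dx + 0.5 * DT * dx)  # the approximation in the text
from_zero = excess_burden(DT)           # same increase, no initial tax

print("added:", added, "trapezoid:", trapezoid, "from zero:", from_zero)
```

Under these assumptions the same small increase costs far more when it is stacked on an existing tax than when it is imposed at the undistorted equilibrium, which is the first-order effect described above.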

If  we  wish  to  consider  the  effects  of  several  taxes  at  once,  we  must  use  more 
sophisticated  measures  based  on  the  Hicksian  variations.  For  the  remainder  of 
this  subsection,  we  focus  on  the  case  of  a single  consumer  facing  fixed  producer 
prices.  These  restrictions  are  relaxed  in  Section  3. 

Using  the  equivalent  variation,  Mohring  (1971)  suggests  that  the  excess  burden 
of  taxation  is  the  amount  in  excess  of  taxes  being  collected  that  the  consumer 
would  give  up  in  exchange  for  the  removal  of  all  taxes;  that  is,  how  much  more 
could  be  collected  from  the  consumer  (and  thrown  away)  than  is  currently  being 
collected,  with  no  loss  in  utility,  if  the  collection  method  were  lump  sum  taxation. 
In  the  terminology  used  above,  we  may  write  this  measure  as 


EB_E = E(p_1, V(p_1, y)) - E(p_0, V(p_1, y)) - R(p_1, y)

= y - E(p_0, V(p_1, y)) - (p_1 - p_0) \cdot x(p_1, y),  (2.9)
Figure 2.5.  Excess burden with a pre-existing tax.


where R(p1, y) is the tax revenue collected when prices are at p1 and the
consumer’s  income  equals  y. 

Alternatively, Diamond and McFadden (1974) suggest the use of the compensating
variation by defining excess burden to be that amount, in addition to
revenues collected, that the government must supply to the consumer to allow
him to maintain the initial utility level. That is, how much must come from
“outside” the system to compensate for the tax distortion. To avoid double-counting,
we include in the government’s revenue the additional amount it collects
because  the  individual  is  compensated  and  (for  a normal  good)  demands  more  of 
the  taxed  commodity.  Thus,  the  Diamond-McFadden  measure  may  be  written 

EB_C = E(p_1, V(p_0, y)) - E(p_0, V(p_0, y)) - R(p_1, E(p_1, V(p_0, y)))

= E(p_1, V(p_0, y)) - y - (p_1 - p_0) \cdot x(p_1, E(p_1, V(p_0, y)))

= E(p_1, V(p_0, y)) - y - (p_1 - p_0) \cdot x_c(p_1, V(p_0, y)),  (2.10)

[where the last step uses the identity x(p, E(p, U)) = x_c(p, U)]. As with EB_E,
EB_C must be non-negative.

For  a single  price  change,  these  two  measures  of  excess  burden  may  be 
graphically  compared  to  the  Marshallian  measure  shown  in  Figure  2.3.  The  three 
measures  together  are  shown  in  Figure  2.6.  To  obtain  the  equivalent  variation 
measure  or  the  consumers’  surplus  measure  of  excess  burden,  we  subtract  the 
revenue  actually  collected  at  x(p,  y)  from  the  respective  measures  shown  in 
Figure  2.2.  For  the  compensating  variation  measures,  we  subtract  the  revenue  that 
would  be  collected  if  utility  were  kept  at  V(p0,y).  This  yields  the  areas  A, 
A + B,  and  C for  the  three  respective  measures.  Note  that  the  two  Hicksian 


Figure 2.6.  A comparison of excess burden measures.


Alan J. Auerbach


measures  no  longer  bracket  the  Marshallian  one.7  If  the  taxed  good  is  normal,  the 
latter  is  necessarily  larger  than  each  of  the  former,  and  the  discrepancy  may  be 
quite  large. 
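The ranking of the three measures can be verified in a simple case. The sketch below evaluates (2.9), (2.10) and the Marshallian measure for a single tax, assuming Cobb-Douglas preferences, a fixed producer price and an untaxed second good with price one; the parameter values and function names are hypothetical.

```python
import math

A = 0.5    # budget share of the taxed good (assumed)
Y = 100.0  # lump-sum income (assumed)
K = A**A * (1 - A)**(1 - A)  # Cobb-Douglas normalising constant


def indirect_utility(p, y):
    """V(p, y) with the untaxed good's price normalised to 1."""
    return K * y * p**(-A)


def expenditure(p, u):
    """E(p, U) for the same preferences."""
    return u * p**A / K


def ordinary_demand(p, y):
    """x(p, y) for the taxed good."""
    return A * y / p


def compensated_demand(p, u):
    """x_c(p, U), the price derivative of the expenditure function."""
    return A * expenditure(p, u) / p


P0, T = 1.0, 1.0  # fixed producer price and unit tax (assumed)
P1 = P0 + T       # consumer price under the tax

# Equivalent variation measure (2.9): revenue actually collected is netted out.
eb_e = Y - expenditure(P0, indirect_utility(P1, Y)) - T * ordinary_demand(P1, Y)

# Compensating variation measure (2.10): revenue at the compensated demand.
u0 = indirect_utility(P0, Y)
eb_c = expenditure(P1, u0) - Y - T * compensated_demand(P1, u0)

# Marshallian measure: loss of consumers' surplus less revenue collected.
eb_m = A * Y * math.log(P1 / P0) - T * ordinary_demand(P1, Y)

print("EB_E:", eb_e, "EB_C:", eb_c, "Marshallian:", eb_m)
```

With a normal taxed good, both Hicksian measures come out positive and below the Marshallian one, so the bracketing that held for the variations themselves is lost once revenue is subtracted.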

Other logical measures of excess burden involving the equivalent and compensating
variations may be conceived.8 In addition, it is easy to adapt the two
measures  already  derived  to  the  case  where  the  initial  equilibrium  is  not  Pareto 
optimal  but  is  already  distorted  by  taxes.  The  equivalent  variation  measure  of 
additional  excess  burden  would  then  be  the  amount,  in  excess  of  additional  tax 
revenues, that the consumer would pay to avoid the latest price increase from p1
to p2,

EB_E = E(p_2, V(p_2, y)) - E(p_1, V(p_2, y))

- [R(p_2, y) - R(p_1, E(p_1, V(p_2, y)))]

= y - E(p_1, V(p_2, y)) - (p_2 - p_0) \cdot x(p_2, y)

+ (p_1 - p_0) \cdot x_c(p_1, V(p_2, y))

= y - E(p_1, V(p_2, y)) - (p_2 - p_1) \cdot x(p_2, y)

+ (p_1 - p_0) \cdot (x_c(p_1, V(p_2, y)) - x(p_2, y)).  (2.11)

Comparing  (2.11)  with  (2.9),  we  find  that  (2.11)  contains  an  additional  expression 
representing  the  reduction  in  tax  revenues  as  demand  declines  with  the  new  rise  in 
price,  with  utility  held  constant  at  V(p2,  y).  This  additional  term  corresponds  to 
that  found  for  the  basic  consumers’  surplus  measure  in  Figure  2.6.  Likewise,  the 
compensating  variation  measure  would  be  the  amount  in  excess  of  the  change  in 
revenues  that  would  be  required  to  maintain  the  initial  utility  level,  or 

EB_C = E(p_2, V(p_1, y)) - E(p_1, V(p_1, y))

- [R(p_2, E(p_2, V(p_1, y))) - R(p_1, y)]

= E(p_2, V(p_1, y)) - y - (p_2 - p_0) \cdot x_c(p_2, V(p_1, y))

+ (p_1 - p_0) \cdot x(p_1, y)

= E(p_2, V(p_1, y)) - y - (p_2 - p_1) \cdot x_c(p_2, V(p_1, y))

+ (p_1 - p_0) \cdot (x(p_1, y) - x_c(p_2, V(p_1, y))),  (2.12)


7This  was  pointed  out  by  Hausman  (1981a),  among  others. 
8See  Auerbach  and  Rosen  (1980)  for  further  discussion. 
where the additional term compared to (2.10) is the revenue lost as demand
declines with utility held constant at V(p1, y).

3.  Evaluating  the  measures  of  excess  burden 

3.1.  Taylor  approximations  and  informational  requirements 

For  purposes  of  exposition,  it  is  sometimes  easier  to  express  the  deadweight  loss 
calculations  above  in  terms  of  second-order  Taylor  approximations.  For  example, 
if we expand the exact measure EB_C around the initial price vector p1, we obtain

EB_C = \left. \frac{d EB_C}{dp} \right|_{p_1} \cdot (p_2 - p_1) + \tfrac{1}{2} (p_2 - p_1)' \left. \frac{d^2 EB_C}{dp^2} \right|_{p_1} (p_2 - p_1) + \cdots,  (3.1)

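which, ignoring all terms beyond the second order, yields

EB_C \approx -(p_1 - p_0)' \frac{d x_c}{dp} (p_2 - p_1)

+ \tfrac{1}{2} (p_2 - p_1)' \left[ -\frac{d x_c}{dp} - (p_1 - p_0)' \frac{d^2 x_c}{dp^2} \right] (p_2 - p_1),  (3.2)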

where x_c is evaluated at p1 and V(p1, y). If we make a further approximation
by ignoring the curvature terms of the compensated demand function d²x_c/dp²,
we obtain

EB_C \approx -(t' S \Delta t + \tfrac{1}{2} \Delta t' S \Delta t) = -(t' \Delta x_c + \tfrac{1}{2} \Delta t' \Delta x_c),  (3.3)

where t = (p1 − p0), Δt = (p2 − p1), S = dx_c/dp is the Slutsky matrix, and
Δx_c = SΔt. This is of a form similar to the single-market measure derived above
for  simple  consumers’  surplus,  but  the  changes  in  demand  are  now  compensated 
changes  rather  than  ordinary  ones.  The  approximation  in  (3.3)  is  that  originally 
derived  by  Harberger  (1964),  although  the  procedure  used  to  derive  it  here  is 
somewhat  simpler.9 

From  (3.3),  we  may  observe  a number  of  additional  characteristics  of  tax- 
induced  excess  burden.  First  of  all,  when  there  are  pre-existing  taxes  in  other 

9 One  can  also  derive  higher-order  approximations  of  EBC.  For  a comparison  of  the  errors  involved 
in  using  second-  and  third-order  approximations,  see  Green  and  Sheshinski  (1979). 
markets, the introduction of another tax need not worsen things. We must weigh
the strictly positive term −½(Δt_i)²S_ii for the new tax in market i against the
cross-effects −t_j S_ji Δt_i in each other market j, which represent the loss in revenue
from the tax t_j due to the drop in demand resulting from the price increase in
market i. Since S_ji may be positive or negative, so may each of those terms. In
general, if pre-existing taxes are on goods substitutable for good i (S_ji > 0), the
new  tax  is  more  likely  to  lessen  the  total  excess  burden  of  the  tax  system. 

A second  observation  to  make  from  (3.3)  is  that  excess  burden  is  a non-linear 
function of tax rates. Consider, for example, a single tax t_i imposed upon a state
without taxes. The excess burden is approximately −½t_i²S_ii, so that it increases
with  the  square  of  the  tax.  This  suggests  that  to  raise  a certain  amount  of  revenue, 
we  might  reduce  excess  burden  by  using  several  small  taxes  rather  than  a few 
large  ones,  perhaps  tilting  toward  those  with  smaller  own-substitution  effects  for 
which  the  scale  of  excess  burden  is  lower.  However,  once  several  taxes  are  used, 
the  cross-effects  just  discussed  need  also  be  evaluated.  How  these  aspects  fit 
together  will  become  clearer  in  Section  5 when  we  formally  consider  the  optimal 
tax  problem. 
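Both observations drawn from (3.3) can be illustrated with a two-good example. The sketch below is only indicative: the Slutsky matrix is a hypothetical symmetric, negative definite block for two taxed goods that are substitutes for one another, and the tax vectors and function names are assumptions made for this example.

```python
import math


def bilinear(u, m, v):
    """Return u' m v for small vectors u, v and matrix m given as lists."""
    return sum(u[i] * m[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def added_excess_burden(t, dt, s):
    """Approximation (3.3): -(t' S dt + 0.5 * dt' S dt)."""
    return -(bilinear(t, s, dt) + 0.5 * bilinear(dt, s, dt))


# Hypothetical Slutsky block: negative own effects, positive cross effects.
S = [[-2.0, 1.0],
     [1.0, -2.0]]
NO_TAX = [0.0, 0.0]

# A single tax imposed on an untaxed state: the burden grows with the square
# of the rate, so doubling the tax quadruples the approximate excess burden.
eb_unit = added_excess_burden(NO_TAX, [1.0, 0.0], S)
eb_double = added_excess_burden(NO_TAX, [2.0, 0.0], S)

# A new tax on good 2 when its substitute, good 1, is already taxed: the
# cross-effect can outweigh the own effect and lower total excess burden.
eb_second_tax = added_excess_burden([1.0, 0.0], [0.0, 0.4], S)

print("unit tax:", eb_unit, "double tax:", eb_double,
      "added by second tax:", eb_second_tax)
```

A negative value for the last quantity means that, under these assumed substitution effects, taxing the second good reduces the total excess burden of the tax system.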

Aside  from  expositional  purposes,  the  use  of  a Taylor  approximation  can  only 
be  justified  on  grounds  of  insufficient  information.  If  we  know  the  consumer’s 
expenditure  function,  we  can  calculate  either  of  the  exact  measures  of  excess 
burden  explicitly.  Even  if  we  know  only  the  consumer’s  ordinary  demand 
function,  we  can  solve  for  his  indirect  utility  function  and  hence  his  compensated 
demand  function  (in  principle)  using  the  system  of  partial  differential  equations 
generated  by  Roy’s  identity,10 


x(p, y) = -\frac{\partial V / \partial p}{\partial V / \partial y}.  (3.4)


Thus,  we  must  know  less  than  the  consumer’s  demand  function  if  we  are  to  justify 
the  use  of  an  approximation;  perhaps  only  its  local  properties.  However,  even  in 
this  case,  it  is  probably  preferable  to  construct  an  exact  measure  to  the  extent  of 
one’s limited knowledge of demand characteristics away from the initial equilibrium,
and use confidence bounds based on the precision of our underlying
parameter  estimates.  Alternatively,  one  can  use  revealed  preference  theory  in 
conjunction  with  observed  data  to  derive  bounds,  without  ever  specifying  a 
particular  demand  function  [Varian  (1982)]. 
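The recovery of an exact measure from an ordinary demand function can be sketched numerically. Along a path of constant utility, the expenditure needed satisfies dE/dp = x(p, E), so integrating this differential equation from (p0, y) gives E(p1, V(p0, y)) without an explicit utility function. The code below is a minimal sketch in that spirit, using a standard fourth-order Runge-Kutta step; it is not the algorithm of any of the papers cited, and the Cobb-Douglas demand, the parameter values and the function names are assumptions chosen so that the answer can be checked in closed form.

```python
A = 0.4    # budget share of the taxed good (assumed)
Y = 100.0  # initial income (assumed)


def ordinary_demand(p, y):
    """Ordinary demand for the taxed good; Cobb-Douglas is assumed here."""
    return A * y / p


def compensated_expenditure(demand, p0, p1, y, steps=1000):
    """Integrate dE/dp = demand(p, E) from (p0, y) to p1 by Runge-Kutta."""
    h = (p1 - p0) / steps
    p, e = p0, y
    for _ in range(steps):
        k1 = demand(p, e)
        k2 = demand(p + h / 2, e + h * k1 / 2)
        k3 = demand(p + h / 2, e + h * k2 / 2)
        k4 = demand(p + h, e + h * k3)
        e += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        p += h
    return e


P0, P1 = 1.0, 2.0  # producer price and tax-inclusive price (assumed)

e1 = compensated_expenditure(ordinary_demand, P0, P1, Y)
cv = e1 - Y                                    # compensating variation
eb_c = cv - (P1 - P0) * ordinary_demand(P1, e1)  # measure (2.10)

exact_e1 = Y * (P1 / P0) ** A  # closed form for the assumed preferences
print("numerical E:", e1, "closed form:", exact_e1, "EB_C:", eb_c)
```

The last line of the calculation uses the identity x(p, E(p, U)) = x_c(p, U), so the compensated demand needed for the revenue term is obtained by evaluating the ordinary demand at the compensated level of income.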

A second  defense  of  the  use  of  approximations  or  even  of  simple  consumers’ 
surplus  measures  is  that  the  demand  function  as  estimated  is  not  integrable,  so 
that  we  cannot  use  the  procedure  suggested  above  to  derive  the  associated 


10See  Hausman  (1981a).  Vartia  (1983)  presents  a numerical  algorithm  for  generating  utility 
functions  from  demand  functions. 
compensated demand function. However, lack of integrability is synonymous
with  the  violation  of  the  laws  of  demand.  If  such  laws  are  violated,  what 
interpretation  can  we  give  any  measure  we  use? 


3.2.  Variations  in  producer  prices 

The  assumption  made  thus  far  in  this  section  that  producer  prices  are  fixed  is  a 
common  one  in  the  literature,  but  may  do  violence  to  our  representation  of  the 
actual  situation  prevailing  in  the  economy.  For  example,  we  know  that  a tax  on  a 
good  in  absolutely  fixed  supply  is  equivalent  to  a lump  sum  tax  and  therefore 
non-distortionary,  regardless  of  how  elastic  the  demand  for  the  good  is.  Our 
preliminary  examination  of  excess  burden  using  consumers’  surplus  in  Section  2 
suggested  that  the  excess  burden  of  a tax  is  proportional  to  the  reduction  in  the 
output  of  the  taxed  good,  taking  account  of  both  demand  and  supply  conditions. 
It  would  be  useful  to  extend  the  Hicksian  measures  in  the  same  direction. 

The  complication  that  arises  in  doing  so  is  that  it  is  no  longer  sufficient  to  posit 
a certain  money  value  of  compensation:  since  producer  prices  change,  the  form  of 
compensation matters. For example, to extend the compensating variation measure
of excess burden, we must specify the form in which the compensation from
“outside”  the  system,  in  excess  of  collected  revenue,  will  come. 

To  develop  a compensating  variation  measure  of  the  additional  excess  burden 
caused by an increase in taxes, starting at a distorted equilibrium, we let α be the
compensation vector of the elements of x, and β the scalar that determines how
much of the compensation bundle the consumer receives, βα. If we denote
producer prices by q and consumer prices by p, then the compensating variation
measure of excess burden β can be defined implicitly by the equation

V(p_2, y_2 + R_2 - R_1 + q_2 \cdot \alpha\beta) = V(p_1, y_1),  (3.5)


where p1 is the initial consumer price vector, p2 the distorted price vector, q1 and
q2 the corresponding producer price vectors, y1 and y2 the lump sum income in
the two states, and R_1 = (p_1 − q_1) · x(p_1, y_1) and R_2 = (p_2 − q_2) · x_c(p_2,
V(p_1, y_1)) the revenue in the two states. The values of y are indexed by their
respective states because they may vary when producer prices change. For
example,  if  the  economy’s  production  function  exhibits  decreasing  returns  to  scale 
in  the  consumer  goods  x,  then  the  pure  profits  from  competitive  production  are 
positive  and  change  with  the  change  in  producer  prices.  Letting  z be  the  vector  of 
goods produced (negative for net factor inputs), total profits are y = q · z. Note
that production and consumption differ by the infusion of additional compensation,
βα.
Expression (3.5) can be transformed into another that is similar to those of the
previous section. Using the fact that U_A = U_B implies E(p, U_A) = E(p, U_B), and that
E(p, V(p, y)) = y, we obtain

q_2 \cdot \alpha\beta = E(p_2, V(p_1, y_1)) - y_2 - (R_2 - R_1)

= [E(p_2, V(p_1, y_1)) - E(p_1, V(p_1, y_1))] + (y_1 - y_2) - (R_2 - R_1).  (3.6)

Compared to (2.12), there is a new term, (y1 − y2), representing the reduction in
profit between states 1 and 2. Thus, there are now three terms in the expression
for excess burden, representing the changes in consumers’, producers’ and government
surplus, as in the simple, Marshallian example depicted in Figure 2.3.

This  expression  for  excess  burden  also  differs  in  that  it  is  not  actually  a solution 
for β. It will hold regardless of the choice of α, though the solution for β depends
on this choice. This dependence can be demonstrated by considering the second-
order approximation for β,

\beta \approx \frac{d\beta}{dt} \Delta t + \tfrac{1}{2} \Delta t' \frac{d^2 \beta}{dt^2} \Delta t,  (3.7)


evaluated  at  the  initial  point  1.  Total  differentiation  of  (3.5)  yields 


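\frac{\partial V}{\partial p} \cdot dp + \frac{\partial V}{\partial y} \left[ \frac{dy}{dq} \cdot dq + \beta\alpha \cdot dq + d\beta \, \alpha \cdot q + t \cdot dx + x \cdot dt \right] = 0,  (3.8)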


where t = (p − q).

Again  using  the  envelope  theorem,  one  can  show  that  dy/dq  = z.  Using  this 
and  Roy’s  identity  [(3.4)  above],  we  obtain  from  (3.8) 

\frac{\partial V}{\partial y} \left[ -x \cdot dp + z \cdot dq + \beta\alpha \cdot dq + d\beta \, \alpha \cdot q + t \cdot dx + x \cdot dt \right] = 0.  (3.9)


But since x = z + βα and ∂V/∂y ≠ 0, (3.9) simplifies to

q_2 \cdot \alpha \, d\beta = -t \cdot dx,  (3.10)


which  is  precisely  the  form  of  the  first-order  effect  derived  above  in  (3.3). 

We  derive  the  second-order  term  by  totally  differentiating  (3.10).  This  yields 

q_2 \cdot \alpha \, d^2\beta = -dt \cdot dx - d\beta \, \alpha \cdot dq - t \cdot d^2 x,  (3.11)

which,  even  if  one  ignores  the  last  curvature  term,  has  an  additional  term 
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compared  to  the  second-order  effect  in  (3.3),  caused  by  the  changing  value  of  the 
compensation  bundle.  This  may  be  seen  by  substituting  (3.10)  and  (3.11)  into 
(3.7)  to  obtain 


q2  ■ ot/8  ~ — (t'  Ax  + \ At' Ax  + j/3a  ■ Aq),  (3.12) 

where  the  right-hand  side  of  (3.12)  includes  the  first-order  approximations 
(dx/dt)At  for  Ax,  (dq/dt)At  for  Aq,  and  (dji/dt)At  for  fi.  Only  in  the  case 
that  all  compensation  is  in  the  form  of  the  numeraire  commodity  will  (3.12) 
reduce  to  (3.3).11 

This  extra  term  may  be  represented  graphically  by  considering  the  exact 
measure  (3.6)  for  the  case  in  which  there  are  two  goods,  one  of  which  is  taxed. 
This  is  done  in  Figure  3.1.  Let  the  untaxed  good  serve  as  numeraire,  so  that  its 
price  does  not  change.  The  supply  curve  S shows  the  increasing  relative  producer 
price,  q,  of  the  taxed  good  as  its  production  increases.  The  ordinary  demand 
curve D represents the consumer’s preference, given income $y_1$. With an initial
tax of $(p_1 - q_1)$, the initial equilibrium consumption is at $x_1$, where the supply
curve S is that facing the consumer.

As  the  tax  is  increased  further,  we  assume  the  individual  is  maintained  on  the 
same  indifference  curve,  so  that  demand  for  x is  described  by  the  compensated 
demand  curve  passing  through  the  initial  point.  The  supply  curve  facing  the 
consumer now depends on the form the compensation takes. If some of the taxed
good is included in $\alpha$, then the supply to the consumer is described by curve S',
rather than S, since total supply will exceed production. This leads to consumption
at $x_2$, and production at $z_2$, rather than the single value in between that
would obtain if all compensation were in the form of the numeraire commodity.

Consider now the three terms in expression (3.6). All may be represented in
Figure 3.1. The first, as before, is the area to the left of the compensated demand
curve between $p_1$ and $p_2$. Since $dy = z\cdot dq$, the second term in (3.6) equals the
area to the left of the supply curve S between $q_1$ and $q_2$. Finally, $R_1$ and $R_2$
equal in area the rectangles defined by $p_1$, $q_1$ and $x_1$, and $p_2$, $q_2$ and $x_2$,
respectively. The resulting area for $q_2\cdot\alpha\beta$ is the usual trapezoid defined by the
supply curve, the compensated demand curve, $x_1$ and $x_2$ (shaded in Figure 3.1),
less that of the triangle defined by the producers’ supply curve S, the social
supply curve S', and prices $q_1$ and $q_2$ (cross-hatched in Figure 3.1). This new
piece has an area approximately equal to $\tfrac{1}{2}(q_1 - q_2)(x_2 - z_2)$ or, since $x = z + \beta\alpha$
and only this good’s price changes, $-\tfrac{1}{2}\beta\alpha\cdot\Delta q$.

11 In deriving a similar measure, Diamond and McFadden (1974) made this assumption.

Another familiar expression for the second-order effect may be derived from
(3.11). Again ignoring the last curvature term, we use the fact that $x = \beta\alpha + z$ to
obtain

$$d^2\beta = -dp\cdot dx + dq\cdot dz = -dp'S\,dp + dq'H\,dq, \qquad (3.13)$$

where H is the Hessian of the profit function, $d^2y/dq^2 = dz/dq$.

This  expression  for  the  second-order  effect  of  a change  in  taxes  on  welfare  was 
first  developed  by  Boiteux  (1951),  although  his  derivation  was  limited  to  the  case 
where the initial equilibrium is undistorted and the first-order effect $d\beta$ vanishes.

Using the notion of equivalent variation, we can construct a measure by asking
what level of resources can be extracted from the consumer in excess of additional
revenue to avoid an additional tax increase. This yields the following implicit
definition of $\beta$:

$$V(p_2, y_2) = V(p_1, y_1 - (R_2 - R_1) - q_1\cdot\alpha\beta), \qquad (3.14)$$


where, in this case, state 2 is the actual state with taxes at $t_2$, whereas state 1 is the
hypothetical state in which taxes do not rise from $t_1$ but income is reduced to
yield the same level of utility as prevails in state 2. Here, $(1 - \beta)$ is related to
Debreu’s (1951) coefficient of resource utilization, which he defines to be the
proportion of society’s resources that would be necessary to maintain each
individual’s current level of utility if all distortions were removed. Our measure
differs in that we consider the marginal change, rather than removal of a
distortion, and let the vector $\alpha$ be arbitrary. (Of course, Debreu’s measure is
defined relative to all kinds of distortions leading to an inefficient allocation, not
just tax-induced changes in the prices of consumer goods.) As before, we cannot
solve for $\beta$ explicitly, but we can calculate the first-order and second-order effects
$d\beta$ and $d^2\beta$ at the initial distorted point. We leave further discussion of this
measure to the next subsection, which deals with aggregation over consumers.


3.3.  Aggregation  and  welfare  comparisons 

Thus  far,  we  have  defined  all  our  measures  of  excess  burden  for  the  case  of  a 
single  individual.  They  are  easily  generalized  to  the  case  of  several  identical 
individuals.  However,  matters  become  more  complicated  if  we  wish  to  allow  for 
differences  in  individual  tastes,  or  even  differences  in  income  among  otherwise 
identical  individuals. 

Except  under  very  strict  conditions  on  preferences,  any  measure  of  aggregate 
excess  burden  will  depend  on  the  initial  distribution  of  income.  Consider  the  case 
of  fixed  producer  prices  examined  in  Section  2,  and  define  a measure  of  aggregate 
excess  burden,  using  the  compensating  variation,  as  the  amount  that  must  come 
from  outside  the  system  to  maintain  each  consumer  at  his  pre-tax  level  of  utility. 
For  two  individuals,  this  measure  equals  [compare  to  (2.10)] 

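$$
\begin{aligned}
L ={}& E^1(p_1, V^1(p_0, y^1)) + E^2(p_1, V^2(p_0, y^2)) - (y^1 + y^2) \\
&- (p_1 - p_0)\cdot\left(x_c^1(p_1, V^1(p_0, y^1)) + x_c^2(p_1, V^2(p_0, y^2))\right),
\end{aligned}
\qquad (3.15)
$$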


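where superscripts index the consumers 1 and 2.

Suppose now that the initial income distribution is changed by a small
reduction in $y^1$ and an equal size increase in $y^2$. The change in L would be

$$
\begin{aligned}
dL ={}& -\frac{\partial E^1}{\partial U}\cdot\frac{\partial V^1}{\partial y} + 1 + \frac{\partial E^2}{\partial U}\cdot\frac{\partial V^2}{\partial y} - 1 \\
&- (p_1 - p_0)\cdot\left(-\frac{\partial x_c^1}{\partial U}\cdot\frac{\partial V^1}{\partial y} + \frac{\partial x_c^2}{\partial U}\cdot\frac{\partial V^2}{\partial y}\right),
\end{aligned}
\qquad (3.16)
$$

which, using the fact that $x_c(p_1, V(p_0, y)) = x(p_1, E(p_1, V(p_0, y)))$, can be
rewritten as

$$
\begin{aligned}
dL ={}& -\frac{\partial E^1}{\partial U}\cdot\frac{\partial V^1}{\partial y} + \frac{\partial E^2}{\partial U}\cdot\frac{\partial V^2}{\partial y} \\
&- (p_1 - p_0)\cdot\left(-\frac{\partial x^1}{\partial y}\cdot\frac{\partial E^1}{\partial U}\cdot\frac{\partial V^1}{\partial y} + \frac{\partial x^2}{\partial y}\cdot\frac{\partial E^2}{\partial U}\cdot\frac{\partial V^2}{\partial y}\right).
\end{aligned}
\qquad (3.17)
$$

Since $E(p_0, V(p_0, y)) = y$, we may rewrite (3.17) as

$$dL = \left(\frac{\mu^2}{\rho^2} - \frac{\mu^1}{\rho^1}\right) - (p_1 - p_0)\cdot\left(\frac{\mu^2}{\rho^2}\cdot\frac{\partial x^2}{\partial y} - \frac{\mu^1}{\rho^1}\cdot\frac{\partial x^1}{\partial y}\right), \qquad (3.18)$$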


where 


$$\rho^i = \frac{\partial E^i}{\partial U}(p_0, V^i(p_0, y^i)) \quad\text{and}\quad \mu^i = \frac{\partial E^i}{\partial U}(p_1, V^i(p_0, y^i))$$

are the marginal expenditures needed per unit of increased utility at base utility
level $V^i(p_0, y^i)$ and price levels $p_0$ and $p_1$, respectively. Thus, $dL$ will equal
zero, in general, only if two conditions are met:

1) $\mu^i/\rho^i$ equals some common function of prices alone (not income) for the two
individuals; and

2) the vector of income effects $\partial x^i/\partial y$ equals some common function of prices
alone.

Condition 2) implies that ordinary demand functions take the form

$$x^i(p, y^i) = \phi^i(p) + \theta(p)\,y^i, \qquad (3.19)$$

for some functions $\phi^i(\cdot)$ and $\theta(\cdot)$, the latter common across individuals. [The laws
of consumer demand imply, in turn, that $\phi^i(\cdot)$ is homogeneous of degree 0 in
prices and $\theta(\cdot)$ is homogeneous of degree −1 in prices, since a proportional
change in p and y cannot affect $x^i(\cdot)$.] The demand function specified in (3.19)
corresponds to the well-known Gorman (1953) “polar form”, which plays a
central role in the theory of exact aggregation.

Condition 1) implies that, for suitable transformation of the utility function,
consumer i’s expenditure function can be written

$$E^i(p, U^i) = \delta^i(p) + \gamma(p)\cdot U^i, \qquad (3.20)$$

[with $\delta^i(\cdot)$ and $\gamma(\cdot)$ homogeneous of degree 1 in prices]. This is the expenditure
function corresponding to the Gorman polar form [see Muellbauer (1976)], so
that conditions 1) and 2) are each satisfied if and only if preferences satisfy this
very restricted pattern that allows variations from identical homothetic preferences
only through individual-specific displacements through the “basic needs”
function of zero-income consumption, $\phi^i(\cdot)$.
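
The role of the Gorman conditions can be checked numerically. The sketch below is only an illustration under assumed functional forms: it uses Stone-Geary (linear expenditure system) preferences, for which $E^i(p, U^i) = p\cdot g^i + U^i\prod_k (p_k/b_k)^{b_k}$, so that (3.20) holds whenever the marginal budget shares $b$ are common across individuals. The function names, the subsistence bundles $g^i$, and all numbers are hypothetical. It evaluates each consumer's term in (3.15) and compares the effect on L of a one-unit transfer from consumer 1 to consumer 2 with common and with individual-specific shares.

```python
from math import prod


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def excess_burden(p0, p1, y, g, b):
    """One consumer's term in (3.15) under Stone-Geary (LES) preferences.

    g is the (assumed) subsistence bundle and b the marginal budget shares,
    which must sum to one.
    """
    # Price index prod_k (p1_k / p0_k)^{b_k} applied to supernumerary income.
    index = prod((p1k / p0k) ** bk for p0k, p1k, bk in zip(p0, p1, b))
    supernumerary = (y - dot(p0, g)) * index
    # E(p1, V(p0, y)): cost at prices p1 of the utility reached at (p0, y).
    expenditure = dot(p1, g) + supernumerary
    # Compensated demands x_c(p1, V(p0, y)).
    x_c = [gk + bk * supernumerary / p1k for gk, bk, p1k in zip(g, b, p1)]
    tax = [p1k - p0k for p0k, p1k in zip(p0, p1)]
    return expenditure - y - dot(tax, x_c)


def aggregate_excess_burden(p0, p1, incomes, bundles, shares):
    """Sum of the individual terms, as in (3.15)."""
    return sum(
        excess_burden(p0, p1, y, g, b)
        for y, g, b in zip(incomes, bundles, shares)
    )


p0, p1 = [1.0, 1.0], [1.25, 1.0]        # a tax of 0.25 on good 1
bundles = [[1.0, 2.0], [3.0, 1.0]]      # individual-specific "basic needs"
common = [[0.5, 0.5], [0.5, 0.5]]       # Gorman conditions satisfied
mixed = [[0.5, 0.5], [0.2, 0.8]]        # Gorman conditions violated

before, after = [10.0, 20.0], [9.0, 21.0]   # transfer one unit from 1 to 2

change_common = (aggregate_excess_burden(p0, p1, after, bundles, common)
                 - aggregate_excess_burden(p0, p1, before, bundles, common))
change_mixed = (aggregate_excess_burden(p0, p1, after, bundles, mixed)
                - aggregate_excess_burden(p0, p1, before, bundles, mixed))

print(change_common)  # zero up to rounding error
print(change_mixed)   # non-zero: L depends on the income distribution
```

With common shares the transfer leaves L unchanged even though the subsistence bundles differ, which is the individual-specific displacement that the polar form permits.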

Note  that  even  identical  preferences,  unless  homothetic,  will  not  suffice.  For 
example,  suppose  individuals  have  a price-inelastic  compensated  demand  for  a 
commodity  at  high  incomes  but  an  elastic  demand  at  low  incomes.  Then  the 
excess  burden  of  a tax  on  this  good  will  be  increased  if  we  transfer  income  to  the 
poorer  individual,  for  this  will  increase  the  overall  demand  elasticity  for  the  taxed 
good.  Thus,  any  measure  of  excess  burden  we  envisage  is  not  independent  of  the 
income  distribution.  Similarly,  if  we  required  not  that  each  individual’s  utility  be 
kept  constant,  but  that  individual  1 receive  one  dollar  less  than  would  be 
necessary,  this,  too,  would  affect  the  aggregate  measure  for  the  same  reason. 

Of  course,  it  is  still  possible  to  define  measures  of  excess  burden  for  the 
multi-individual  case,  given  the  initial  resource  distribution.  For  example,  we  may 
implicitly  define  a compensating  variation  measure  analogous  to  (3.5)  by  the 
identities

$$V^i(p_2, \hat{\omega}^i(y_2 + R_2 - R_1 + q_2\cdot\alpha\beta)) = V^i(p_1, \omega^i y_1), \quad \forall i, \qquad (3.21)$$

where i indexes the individual, $\omega^i$ is individual i’s actual profit share, and $\hat{\omega}^i$ is
the share needed to maintain each individual on the same indifference curve as
prices rise to $p_2$ and the extra compensation vector $\alpha\beta$ “enters” the system. For
the equivalent variation, the measure for $\beta$ corresponding to (3.14) for several
individuals is

$$V^i(p_2, \omega^i y_2) = V^i(p_1, \hat{\omega}^i(y_1 - R_2 + R_1 - q_1\cdot\alpha\beta)), \quad \forall i. \qquad (3.22)$$

Again, it is not generally possible to solve explicitly for $\beta$ in either case, but we
can derive expressions for the first-order and second-order effects $d\beta$ and $d^2\beta$ by
totally differentiating (3.21) or (3.22) for each i and then adding over i, making
use of the adding-up constraint on the profit shares $\omega$. While the resulting
expressions  for  the  compensating  variation  measure  are  essentially  the  same  as 
those  described  in  Section  3.2  (with  aggregate  demands  replacing  individual 
ones),  an  interesting  result  occurs  in  the  second-order  effect  derived  from  the 
measure  defined  by  (3.22).  It  contains  an  additional  term  reflecting  the  indirect 
impact  of  taxes  on  excess  burden  through  the  change  in  the  income  distribution  in 
state  1 [Debreu  (1954)].  Since  for  an  equivalent  variation  measure  state  1 is  simply 
a hypothetical  state  based  on  the  utility  levels  in  state  2,  changes  in  taxes,  even 
starting at a no-tax position, influence the distribution of real income in state 1.
Indeed,  it  should  not  be  surprising  that  the  condition  required  for  this  extra  term 
to  vanish  is  the  same  one  required  above  for  excess  burden  to  be  independent  of 
the  initial  income  distribution. 

There  is  a temptation  to  respond  to  this  dependency  of  excess  burden  on  the 
distribution  of  income  by  conceptually  separating  questions  of  allocation  and 
distribution,  following  Musgrave’s  (1959)  framework  for  the  different  “branches” 
of  government:  let  the  distribution  branch  worry  about  distribution,  and  the 
allocation  branch  concern  itself  with  minimizing  excess  burden.  However,  there 
are  two  problems  with  this  approach.  First,  if  the  distribution  branch  is  not  in 
operation, we cannot obtain well-behaved social welfare prescriptions by comparing
levels of excess burden in different allocations through the device known as
the  compensation  principle:  one  state  being  preferred  to  another  if  winners  could 
compensate  losers.  Unless  such  compensation  actually  occurs,  the  orderings 
coming  out  of  such  a procedure  need  not  be  well-behaved  or  consistent  with  any 
particular  social  welfare  function.  This  is  the  essence  of  the  critique  of  the  Hicks 
(1940)-Kaldor  (1939)  approach  to  welfare  economics  [Samuelson  (1947)]. 

A second  response  might  be  that  we  are  only  interested  in  efficiency,  not 
distribution,  and  so  will  assign  equal  distributional  weights  to  individuals,  thereby 
allowing  the  interpretation  of  the  aggregate  measures  derived  above  as 
“efficiency-only”  social  welfare  measures.  Such  is  the  approach  suggested  by 
Harberger  (1971).  Unfortunately,  this  will  not  work  either.  We  can  certainly 
imagine  a social  welfare  function  of  the  form 

$$W(U^1, \ldots, U^H) = \sum_{i=1}^{H} U^i, \qquad (3.23)$$

and  can  even  choose  a normalization  for  the  individual  utility  functions  so  that,  in 
the  initial  state,  the  marginal  utility  of  income  and  hence  the  social  marginal 
utility  of  income  for  each  individual  is  one  (“money  metric”  utility).  However, 
once  prices  change,  as  they  will  when  taxes  are  introduced,  the  changes  in  real 
income,  and  hence  the  marginal  utility  of  income,  will  generally  be  different. 




Thus,  for  our  measure  of  excess  burden  to  correspond  to  a social  welfare  function, 
it  would  require  price-dependent  individual  weights,  even  if  the  weights  were 
initially  equal.  Only  when  preferences  satisfy  the  Gorman  conditions  will  weights 
initially  set  equal  remain  equal  in  all  cases  [Roberts  (1980)].  Thus,  it  will  generally 
not  be  possible  to  make  welfare  comparisons  on  the  basis  of  aggregate  measures 
of  excess  burden,  no  matter  what  our  attitude  is  about  the  relative  importance  of 
equity  and  efficiency. 


4.  The  empirical  measurement  of  excess  burden 

The  ultimate  value  of  the  theory  developed  in  Sections  2 and  3 is  in  its  application 
to  measuring  real  world  distortions.  This  section  offers  a brief  review  of  some  of 
the  research  that  has  been  done  in  this  popular  area  of  investigation.  No  attempt 
will  be  made  to  provide  an  exhaustive  summary  of  the  empirical  literature  on  the 
measurement  of  excess  burden. 


4.1.  Measurement  with  Taylor  approximations 

The  earliest  empirical  work  on  the  measurement  of  excess  burden  was  done  by 
Harberger,  in  a series  of  papers.  In  each  case,  he  applied  a second-order  Taylor 
approximation  of  the  form  in  (3.3),  expanded  around  the  no-tax  point.  An 
example  of  this  research  may  be  found  in  Harberger  (1964),  which  considers  the 
welfare  cost  of  a progressive  tax  on  labor  income  by  individual  income  classes. 
Treating capital as a factor supplied by households in a static model, Harberger
(1966)  considered  the  deadweight  loss  from  the  production  distortion  caused  by 
differential  taxation  of  the  return  to  capital  in  the  corporate  and  non-corporate 
sectors.  Non-tax  distortions,  such  as  those  caused  by  monopolistic  pricing,  can 
also  be  analyzed  using  standard  excess  burden  formulae  [Harberger  (1954)].  One 
can  also  analyze  the  intertemporal  allocation  distortion  caused  by  capital  income 
taxes  by  thinking  of  consumption  in  different  periods  as  different  commodities 
[Feldstein  (1978)]. 

Aside  from  the  use  of  the  Taylor  approximation,  a weakness  typical  of  most  of 
this  early  work  (excluding,  of  course,  Harberger’s  piece  on  the  corporate  income 
tax)  was  the  assumption  of  fixed  producer  prices.  With  a convex  production 
frontier,  changes  in  production  prices  would  normally  act  to  lessen  the  excess 
burden caused by a tax increase. An example of the sensitivity of results to this
assumption about production parameters may be found in Chamley (1981) with respect to the
excess burden of capital income taxation.


4.2. Exact measures

As stressed in Section 3, there is rarely a situation in which Taylor approximations
need be used in place of exact measures based on the Hicksian variations.
This point is stressed by a number of authors [including Auerbach and Rosen
(1980) and Hausman (1981a)]. For many systems of demand functions (such as
the linear expenditure system discussed in Section 6) it is easy to recover the
parameters of the expenditure function from estimated ordinary demand functions.
Moreover, one can also use the standard errors of such estimates to place
confidence bounds on the excess burden measures themselves [Hausman (1981a)].
Several  recent  studies  have  used  exact  measures  to  calculate  the  excess  burden  of 
taxation.  For  example,  Rosen  (1978)  considered  the  excess  burden  of  wage 
taxation  using  a linear  expenditure  system  estimated  from  a cross-section. 
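
The gap between an exact measure and a Taylor approximation can be illustrated with a small calculation. The sketch below is a hypothetical example, not a reproduction of any of the studies cited: it assumes Cobb-Douglas preferences with budget shares `shares`, for which $E(p_1, V(p_0, y)) = y\prod_k (p_{1k}/p_{0k})^{a_k}$, and fixed producer prices. It computes the compensating-variation excess burden exactly and compares it with the triangle formula $-\tfrac{1}{2}\Delta t' S \Delta t$ evaluated at the no-tax point for a tax on a single good. All names and numbers are assumptions made for illustration.

```python
from math import prod


def exact_excess_burden(p0, p1, y, shares):
    """Exact compensating-variation excess burden, Cobb-Douglas preferences.

    Computes E(p1, V(p0, y)) - y - (p1 - p0) . x_c(p1, V(p0, y)).
    """
    index = prod((p1k / p0k) ** a for p0k, p1k, a in zip(p0, p1, shares))
    expenditure = y * index                       # E(p1, V(p0, y))
    x_c = [a * expenditure / p1k for a, p1k in zip(shares, p1)]
    revenue = sum((p1k - p0k) * xk for p0k, p1k, xk in zip(p0, p1, x_c))
    return expenditure - y - revenue


def triangle_approximation(p0, y, shares, k, dt):
    """Second-order approximation -0.5 * dt^2 * S_kk around the no-tax point."""
    # Own compensated price derivative of Cobb-Douglas demand at (p0, y).
    s_kk = -shares[k] * (1.0 - shares[k]) * y / p0[k] ** 2
    return -0.5 * dt * dt * s_kk


p0 = [1.0, 1.0]
p1 = [1.25, 1.0]          # tax of 0.25 on good 0, producer prices fixed
shares = [0.5, 0.5]
y = 10.0

exact = exact_excess_burden(p0, p1, y, shares)
approx = triangle_approximation(p0, y, shares, k=0, dt=0.25)
print(exact, approx)
```

In this particular parameterization the triangle overstates the exact figure, because compensated demand becomes less price-responsive as the price rises; with other functional forms the error could go either way.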

One  of  the  additional  benefits  of  the  “exact”  approach  to  measuring  deadweight 
loss  is  that  it  can  readily  be  generalized  to  allow  for  changes  in  income.  That  is, 
we  can  deduct  from  changes  in  the  expenditure  function  not  only  changes  in 
revenue,  but  changes  in  income,  to  calculate  the  excess  burden  of  a tax  system 
that  changes  individual  incomes  as  well  as  the  prices  of  some  commodities.  For 
example,  the  compensating  variation  measure  (2.10)  would  become 

$$EB_c = E(p_1, V(p_0, y_0)) - y_1 - (p_1 - p_0)\cdot x_c(p_1, V(p_0, y_0)), \qquad (4.1)$$

where  y0  is  income  in  the  undistorted  state  and  yl  is  income  in  the  distorted 
state.  This  tool  is  particularly  useful  for  the  analysis  of  progressive  taxes,  where 
individuals  behave  as  if  they  faced  a proportional  tax  equal  to  the  actual 
marginal  rate,  with  the  inframarginal  excess  in  collections  that  results  being 
subtracted from lump sum income. For example, consider the case of a progressive
labor income tax in a two-good model. The individual’s before-tax and
after-tax budget lines are represented in Figure 4.1. If the individual chooses
point A, we may pretend that he did so in response to a proportional tax at rate
$(w_0 - w_A)/w_0$ and lump sum income of $y_A$. If he chooses point B, we could
imagine a proportional tax of $(w_0 - w_B)/w_0$ and lump sum income of $y_B$. This
technique  has  been  used  in  labor  supply  estimation  and  excess  burden  calculation 
by  Hausman  (1981b).  King  (1983b)  has  used  the  equivalent  variation  analogue  of 
(4.1),  which  he  calls  the  “equivalent  gain”,  to  evaluate  the  effects  of  changes  in 
housing  policy  in  the  U.K. 
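
The linearization of a progressive schedule described above can be sketched as follows. This is a minimal illustration that assumes a piecewise-linear tax on earnings; the function name, the bracket representation (`thresholds`, `rates`) and the numbers are hypothetical. For the segment on which the individual locates, it returns the net wage (corresponding to $w_A$ or $w_B$) and the lump sum income (corresponding to $y_A$ or $y_B$) of the equivalent proportional-tax budget line.

```python
def linearized_budget(wage, nonlabor_income, thresholds, rates, earnings):
    """Net wage and lump sum ("virtual") income for the chosen budget segment.

    thresholds[k] is the earnings level at which marginal rate rates[k]
    starts to apply; thresholds[0] is assumed to be zero.
    """
    tax_at_threshold = 0.0
    segment = 0
    for k in range(1, len(thresholds)):
        if earnings < thresholds[k]:
            break
        # Tax accumulated over the bracket that has been fully passed.
        tax_at_threshold += rates[k - 1] * (thresholds[k] - thresholds[k - 1])
        segment = k
    rate = rates[segment]
    net_wage = (1.0 - rate) * wage
    # Extend the segment back to zero earnings: its intercept is the lump sum
    # income that makes a proportional tax at `rate` replicate the segment.
    virtual_income = nonlabor_income + rate * thresholds[segment] - tax_at_threshold
    return net_wage, virtual_income


# Two brackets: 10 percent up to earnings of 100, 30 percent above.
wage, hours = 10.0, 15.0
net_wage, virtual_income = linearized_budget(
    wage, 5.0, thresholds=[0.0, 100.0], rates=[0.1, 0.3], earnings=wage * hours
)
print(net_wage, virtual_income)
```

At the chosen point the linearized budget reproduces actual after-tax resources: earnings of 150 bear a tax of 25, leaving 130 including non-labor income, which equals the net wage times hours plus the lump sum income returned by the function.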

An  additional  extension  possible  with  exact  measures  is  the  case  of  discrete 
choices,  such  as  the  decision  to  work  or  to  purchase  a durable  good.  Suppose 
there  are  two  regimes  among  which  a consumer  must  choose.  The  general 
methodology  for  calculating  excess  burden  is,  as  before,  to  equate  utility  changes 
from distortionary and lump sum taxation, and compare the tax revenue. However,
the changes in utility take account of switches in regime that may occur in
each case. This is a straightforward calculation when the consumer’s indirect
utility  function  is  known,  for  it  is  simple  to  identify  the  regime  chosen  in  any 
situation.  However,  if  one  wishes  to  use  approximation  formulae,  one  must  take 
explicit  account  of  the  effect  of  taxes  on  the  probability  of  switching  regimes.  [See 
Small  and  Rosen  (1981).]  An  example  of  excess  burden  calculations  with  discrete 
decision  variables  is  the  analysis  of  housing  subsidy  programs  by  Venti  and  Wise 
(1984),  in  which  individuals  must  decide  whether  to  move  or  stay,  and  face 
different  budget  constraints  in  the  two  situations. 


4.3.  Simulation  methods 

Ultimately,  there  are  limitations  on  the  extent  to  which  we  can  obtain  closed  form 
solutions  for  excess  burden.  This  is  particularly  true  of  general  equilibrium 
calculations,  for  we  must  solve  explicitly  for  the  changes  in  producer  prices 
consistent with changes in consumer behavior. A solution to this problem is the
simulation model, in which explicit parameterizations of preferences and technology
are made and actual equilibria calculated. It is then straightforward to
estimate  changes  in  utility  caused  by  a change  in  tax  regime,  or  the  resources  one 
could  extract  or  must  add  to  compensate  for  a given  change.  The  latter  type  of 
calculation  corresponds  to  the  price-varying  excess  burden  measures  cited  in 
Section  3.  The  use  of  disaggregated,  static  general  equilibrium  models  to  analyze 
the  effects  of  taxation  has  now  become  rather  common.  An  early  example  of  the 
use of the simulation technique is Shoven’s (1976) reconsideration of the excess
burden  caused  by  the  corporate  income  tax.  For  other  applications,  see  the 
contributions  in  Feldstein  (1983).  In  more  recent  work,  Auerbach,  Kotlikoff  and 
Skinner  (1983)  use  a perfect-foresight,  overlapping-generations  growth  model  to 
analyze  the  effects  on  different  cohorts  of  individuals  of  various  dynamic  tax 
changes,  such  as  an  unannounced  switch  from  income  taxation  to  consumption 
taxation. 


5.  The  theory  of  optimal  taxation 

Taxes  distort  behavior  and  cause  excess  burden.  How  can  this  excess  burden  be 
kept  to  a minimum  while  government  simultaneously  raises  the  revenue  it  requires 
for  public  expenditures?  This  is  the  optimal  tax  problem,  solved  in  its  basic  form 
by  Ramsey  (1927). 

Of  course,  there  do  exist  non-distortionary  taxes,  at  least  hypothetically.  Taxes 
on  pure  profits  are  just  one  form  of  such  taxation.  The  optimal  tax  problem,  in  a 
sense,  embodies  the  concession  that  such  ideal  taxes  may  be  difficult  to  institute 
in  practice.  One  might  cite  a number  of  reasons  for  this,  including  the  political 
constraints  on  non-uniform  taxation  dependent  on  personal  characteristics.  For 
example,  we  might  succeed  in  having  a non-distortionary  and  progressive  tax 
system  by  taxing  according  to  genetic  characteristics  associated  with  ability,  but 
such  schemes  are  typically  proscribed.  In  addition,  it  may  be  impossible  to 
observe  such  characteristics. 

In  the  next  subsection,  we  present  and  interpret  the  basic,  single-individual 
optimal  tax  results,  paying  particular  attention  to  the  role  of  the  “untaxed” 
numeraire  commodity  that  is  often  a confusing  part  of  such  analysis.  Section  5.2 
discusses  the  relationship  of  the  optimal  tax  solution  to  the  measures  of  excess 
burden  described  above.  In  Sections  5.3  and  5.4,  we  show  how  the  results  can  be 
extended  to  allow  for  profits  and  changing  producer  prices,  and  interpret  the 
classic results of Diamond and Mirrlees (1971) and Stiglitz and Dasgupta (1971)
concerning the desirability of production efficiency in the presence of distortionary
commodity taxes.

We imagine a representative consumer who has exogenous income $y$, and faces
consumer prices $p = (p_0, p_1, \ldots, p_N)$ for the commodities 0, 1, ..., N, which have
fixed producer prices $q = (q_0, q_1, \ldots, q_N)$. Without any loss of generality, we may
choose good zero as the numeraire and set $q_0 = 1$.

The  government  may  use  unit  excise  taxes  t = (t0,t1,...,tN)  on  the  goods 
0, 1, . . . , N,  to  raise  a certain  amount  of  required  revenue,  R.  (We  will  relax  this 
ignorance  of  the  expenditure  side  below.)  Assuming  the  consumer  maximizes 
utility $U(x)$ in the goods $x$, subject to the prices $p$ and income $y$, we may express
the optimal tax problem by

$$\max_{p}\ \left[\max_{x}\ U(x)\ \ \text{subject to}\ \ p\cdot x = y\right] \quad \text{subject to}\ \ (p - q)\cdot x = R, \qquad (5.1)$$


or, using the definition of the indirect utility function $V(\cdot)$,

$$\max_{p}\ V(p, y) \quad \text{subject to}\ \ (p - q)\cdot x = R. \qquad (5.2)$$

Note that we specify the price vector, $p$, as our control rather than $t$, but this is a
trivial distinction when the social cost vector $q$ is fixed since $dt/dp = I$, the
identity matrix of order N + 1.

The  first-order  conditions  for  the  Lagrangian 

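$$V(p, y) - \mu\left[R - (p - q)\cdot x\right] \qquad (5.3)$$

are

$$-\lambda x_i + \mu\left[\sum_j t_j \frac{\partial x_j}{\partial p_i} + x_i\right] = 0, \quad \forall i, \qquad (5.4)$$

where $\lambda = \partial V/\partial y$ is the consumer’s marginal utility of income. Condition (5.4)
may be rearranged in a number of ways. Perhaps the most useful involves
splitting the cross-price effects $\partial x_j/\partial p_i$, using the Slutsky equation, and defining

$$\alpha = \lambda + \mu\sum_j t_j \frac{\partial x_j}{\partial y} \qquad (5.5)$$

to be the marginal social utility of income [Diamond (1975)], to obtain

$$-\sum_j S_{ij} t_j = \left(\frac{\mu - \alpha}{\mu}\right) x_i, \quad \forall i, \qquad (5.6)$$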

where the $S_{ij}$’s are components of the Slutsky matrix S. The term $\alpha$ differs from $\lambda$
because, in the presence of excise taxes, a dollar given to the individual increases
his utility directly by $\lambda$ and indirectly by the increased revenue resulting from
additional expenditure. Since we can interpret the Lagrange multiplier of the
revenue constraint, $\mu$, as the shadow cost in terms of utility of raising an
additional dollar of revenue, the indirect gain of revenue added by increased
expenditures out of an additional dollar of income equals $\mu\sum_j t_j(\partial x_j/\partial y)$, the
second term in the definition of $\alpha$.
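
The step from (5.4) to (5.6) is compressed in the text, so it may help to spell it out. The lines below are a reader's reconstruction that uses only the Slutsky equation, the definition of $\alpha$ in (5.5), and the symmetry of the Slutsky matrix; the notation follows the surrounding equations.

```latex
\begin{align*}
% Slutsky decomposition of the effect of p_i on the ordinary demand for good j
\frac{\partial x_j}{\partial p_i} &= S_{ji} - x_i \frac{\partial x_j}{\partial y} \\
% substitute into the first-order condition (5.4)
0 &= -\lambda x_i + \mu\Bigl[\sum_j t_j S_{ji}
     - x_i \sum_j t_j \frac{\partial x_j}{\partial y} + x_i\Bigr] \\
% collect the terms multiplying x_i and use the definition of alpha in (5.5)
\mu \sum_j S_{ji} t_j &= \Bigl(\lambda + \mu \sum_j t_j \frac{\partial x_j}{\partial y} - \mu\Bigr) x_i
  = (\alpha - \mu)\, x_i \\
% use S_{ji} = S_{ij} and divide by -mu
-\sum_j S_{ij} t_j &= \Bigl(\frac{\mu - \alpha}{\mu}\Bigr) x_i
\end{align*}
```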

The term $(\mu - \alpha)$ represents the difference between raising a dollar of revenue
at the actual margin and raising it through a direct taking of income from the
consumer: the marginal excess burden of the tax. This term is always non-negative
[see expression (7.8)] and hence the terms $-\sum_j S_{ij}t_j$ are also non-negative.

There is one potential solution to (5.6) that would be particularly attractive, for
it involves no distortion. If we choose equal proportional ad valorem taxes, or

$$t_i = \theta p_i, \quad \forall i, \qquad (5.7)$$

for some constant $\theta$, we obtain

$$-\theta\sum_j S_{ij} p_j = \left(\frac{\mu - \alpha}{\mu}\right) x_i, \quad \forall i. \qquad (5.8)$$

But $\sum_j S_{ij} p_j$ equals $(1/\lambda)(dU/dp_i)|_U = 0$ for all i. (This is simply a statement of
the envelope theorem.) Therefore, the system of equations in (5.8) is satisfied for
$\mu = \alpha$ and hence no excess burden. Thus, proportional excise taxes would appear
to be the solution.

The reason such taxes are non-distortionary, however, is the key to their limited
applicability. Since $p = q + t = q + \theta p$, $p = q/(1 - \theta)$. Hence, the consumer’s
budget constraint becomes

$$\frac{q}{1 - \theta}\cdot x = y \quad\text{or}\quad q\cdot x = y(1 - \theta), \qquad (5.9)$$

where $\theta$ is chosen to satisfy $\theta = R/y$. A system of equal excise taxes is nothing
more than a tax on the consumer’s exogenous income, and hence a lump sum tax.
If y = 0, then no finite value of $\theta$ will satisfy the revenue constraint, so we must
ask when y will be non-zero.
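
The equivalence in (5.9) is easy to verify numerically. The sketch below assumes Cobb-Douglas demands purely for illustration; the function name and all numbers are hypothetical. It sets $\theta = R/y$, prices every good at $q/(1 - \theta)$, and checks that the resulting bundle and revenue coincide with those under a lump sum tax of R at producer prices.

```python
def cobb_douglas_demand(prices, income, shares):
    """Ordinary demands a_k * income / p_k (shares sum to one)."""
    return [a * income / p for a, p in zip(shares, prices)]


q = [1.0, 2.0, 4.0]          # fixed producer prices
shares = [0.2, 0.3, 0.5]
y, R = 100.0, 20.0           # exogenous income and required revenue

theta = R / y                # common ad valorem rate, t_i = theta * p_i
p = [qk / (1.0 - theta) for qk in q]

x_uniform = cobb_douglas_demand(p, y, shares)       # equal excise taxes
x_lump_sum = cobb_douglas_demand(q, y - R, shares)  # lump sum tax of R
revenue = sum((pk - qk) * xk for pk, qk, xk in zip(p, q, x_uniform))

print(x_uniform, x_lump_sum, revenue)
```

The calculation breaks down when `y` is zero, since `theta` is then undefined, which is the case the text goes on to examine.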

First  of  all,  y will  be  non-zero  in  general  if  there  are  decreasing  returns  to  scale 
in  production  (in  a more  general  model  not  assuming  fixed  producer  prices).  Even 
in  the  absence  of  pure  profits,  y will  be  non-zero  if  we  interpret  it  as  “full 
income”  and  the  x vector  as  consumption  rather  than  demand.  For  example, 
suppose the $x$ vector consists of two commodities, consumption $C$ and leisure $l$,
and that the consumer has a labor endowment $L$. Without pure profits, the
consumer’s budget constraint in the absence of taxes may be written either as

$$q_C C + (l - L) = 0 \qquad (5.10a)$$

or as

$$q_C C + l = L \equiv y, \qquad (5.10b)$$

where labor is the numeraire and $C$ and $q_C$ are the amount and relative price of
consumption. Interpreting the labor commodity we can tax as net purchase of
leisure $(l - L)$, we have no income $y$ to tax through proportional excise taxes.
Interpreting the commodity as consumption of leisure, $l$, we can use the proportional
tax solution on $C$ and $l$ to tax $L$ indirectly. Hence, the inability to use
proportional taxes to raise revenue is equivalent to the restriction that only
explicit purchases, rather than total consumption, can be taxed. Under this restriction, a
proportional tax raises no revenue [Baumol and Bradford (1970)]. Based on
examples  of  this  sort,  various  authors  have  equated  the  need  to  use  distortionary 
taxes  with  the  inability  to  tax  leisure,  but  this  is  somewhat  misleading  on  two 
counts:  we  can  tax  leisure  purchases  (labor  supply),  and  this  restriction  applies  to 
any  commodity  in  which  the  consumer  has  an  endowment. 

Once we do restrict our taxes to net purchases, it is easiest to interpret the
vector $x$ to be such flows rather than total consumption. In exchange for the loss
of a non-distortionary tax scheme, we gain an additional free normalization. Since
the consumer’s indirect utility function is homogeneous of degree zero in prices
and income, and is now simply $V(p)$, it is also homogeneous of degree zero in
prices. So is the revenue constraint: since $p\cdot x = 0$, it follows that for any constant $\phi$,

$$(\phi p - q)\cdot x = (\phi - 1)p\cdot x + (p - q)\cdot x = (p - q)\cdot x. \qquad (5.11)$$

Thus, we may choose any scale for $p$. It is customary to set $p_0 = 1$, thereby
making the numeraire also the arbitrarily “untaxed” good. Typically, in models
where there is a single factor supplied, labor, and several commodities purchased,
labor is chosen as this numeraire. While such a normalization is innocuous and in
no way affects the real characteristics of the outcome, it can be very confusing:
the untaxed good, labor, just happens to be the only good with an endowment, $L$,
that we cannot tax independently of its consumption, $l$; hence the loss of
distinction between untaxable and untaxed goods. If we chose corn as the
untaxed good, labor would still have an untaxable endowment. This distinction is
important when one interprets the various rules derived below.

We now have only $N$ first-order conditions, from (5.6), having dropped that
corresponding to $p_0$. Hence, the strategy of equal proportional taxes at rate
$\theta$ (with a zero tax on good zero, of course) now gives us the terms

$$\sum_{j \neq 0} \theta p_j S_{ji} = -\theta p_0 S_{0i} = -\theta S_{i0} \qquad (5.12)$$

on the left-hand side of (5.6). This will stand in constant proportion to $x_i$
over $i$, as required for a solution, only if the compensated cross-elasticity
of demand for each good $i$ with respect to the price of good 0,
$\varepsilon_{i0} = S_{i0}\,p_0/x_i = S_{i0}/x_i$, is the same for all
$i \neq 0$. Thus, equal proportional taxes on all taxed goods satisfy the
first-order conditions only if all goods are equally complementary [in the
sense of Hicks (1946)] to the untaxed good. Naturally, if these conditions are
satisfied for a given choice of untaxed good, they will not generally work for
another.

Our  analysis  of  (5.6)  has  now  generally  ruled  out  uniform  taxation.  But  how 
should  the  taxes  diverge  from  uniformity?  Note  that  the  N conditions  in  (5.6)  can 
be  stacked  to  yield 


$$S t = -\frac{\mu - \alpha}{\mu}\,x, \qquad (5.13)$$

where $S$ is the Slutsky matrix excluding good zero and $t = (t_1, \ldots, t_N)$.
Although there is no independent condition with respect to the tax on good zero
(which has been normalized to zero), it is helpful to note that these $N$
conditions imply that (5.6) also holds for good zero. This may be shown as
follows. Adding a term multiplied by $t_0$ to each of the $N$ first-order
conditions in (5.6) has no effect, since $t_0 = 0$. Thus,12

$$\sum_{i=0}^{N} S_{0i} t_i = \sum_{i=0}^{N}\Bigl(-\sum_{k=1}^{N} p_k S_{ki}\Bigr) t_i = -\sum_{k=1}^{N} p_k \sum_{i=0}^{N} S_{ki} t_i = \frac{\mu - \alpha}{\mu}\sum_{k=1}^{N} p_k x_k = -\frac{\mu - \alpha}{\mu}\,x_0. \qquad (5.14)$$

12 This uses the facts that $\sum_k p_k S_{ki} = 0$ and $p \cdot x = 0$.
$$S t = -\frac{\mu - \alpha}{\mu}\,x. \qquad (5.15)$$


Suppose that the government is currently raising its revenue through lump sum
taxes, and must now shift over some of the revenue collection to distortionary
taxes. From above, we know that there is no first-order effect on utility of
introducing distortionary taxes from a Pareto optimum, so that the effects on
demand of this small change in prices will be compensated effects. Thus, to a
first-order Taylor approximation, the reduction in the demand for good $i$ will
be

$$-\Delta x_i = -\sum_j S_{ij}\,\Delta p_j = -\sum_j S_{ij}\,t_j, \qquad (5.16)$$

so that (5.15) calls for an equiproportional reduction in demand for each good.
As suggested by Dixit (1970), this makes intuitive sense in light of the excess
burden formulae calculated above. From (3.3), the introduction of small taxes
$t$ starting from a Pareto optimum induces an excess burden of approximately

$$L = -\tfrac{1}{2}\sum_i \Delta t_i\,\Delta x_i = -\tfrac{1}{2}\sum_i t_i\,\Delta x_i, \qquad (5.17)$$

so that each small tax $t_i$ will induce an excess burden proportional to
$t_i\,\Delta x_i$. On the other hand, the revenue raised by such a tax is
$t_i x_i$. Thus, holding $\Delta x_i/x_i$ constant across goods results in a
constant ratio of excess burden to revenue for each tax. This is precisely the
sort of marginal condition one would expect from minimizing total excess burden
subject to a revenue constraint.

The actual taxes that lead to the achievement of (5.13) and (5.15) may be
obtained by inverting $S$ and multiplying both sides of (5.13) by $S^{-1}$ to
obtain

$$t = -\frac{\mu - \alpha}{\mu}\,S^{-1} x. \qquad (5.18)$$

This yields no neat general expressions for $t$, though for various special
cases one can go a little further.
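As a purely illustrative check of (5.13) and (5.16), the following Python sketch solves a two-good system $St = -\kappa x$ and confirms that the implied first-order demand changes are equiproportional. The Slutsky entries, the purchases, and the value `kappa` standing in for $(\mu - \alpha)/\mu$ are hypothetical numbers chosen for the example, not values taken from the text.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a t = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    t1 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    t2 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [t1, t2]


def ramsey_taxes(S, x, kappa):
    """Taxes solving S t = -kappa * x, with kappa playing the role of (mu - alpha)/mu."""
    return solve_2x2(S, [-kappa * x[0], -kappa * x[1]])


# Hypothetical inputs: a symmetric, negative definite Slutsky block for goods 1 and 2,
# the corresponding net purchases, and an assumed value for (mu - alpha)/mu.
S = [[-1.5, 0.5], [0.5, -1.0]]
x = [5.0, 10.0]
kappa = 0.1

t = ramsey_taxes(S, x, kappa)

# First-order compensated change in each demand: dx_i = sum_j S_ij t_j
dx = [S[i][0] * t[0] + S[i][1] * t[1] for i in range(2)]
proportional_change = [dx[i] / x[i] for i in range(2)]
print(t, proportional_change)
```

In this sketch both entries of `proportional_change` equal `-kappa`, which is the equiproportional reduction described above; the tax rates themselves differ across the two goods.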

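If there are only three goods, two taxed, then (5.18) yields the two equations

$$t_1 = \frac{\mu - \alpha}{\mu}\,\frac{1}{\Delta}\left(-S_{22} x_1 + S_{12} x_2\right), \qquad (5.19a)$$

$$t_2 = \frac{\mu - \alpha}{\mu}\,\frac{1}{\Delta}\left(S_{21} x_1 - S_{11} x_2\right), \qquad (5.19b)$$

where $\Delta = S_{11}S_{22} - S_{12}S_{21}$, which must be $> 0$ because $S$ is
negative semi-definite. Since $S_{i0} + p_1 S_{i1} + p_2 S_{i2} = 0$ for
$i = 1, 2$, we may divide (5.19a) by (5.19b) and substitute to obtain

$$\frac{t_1}{t_2} = \frac{\left(\dfrac{S_{20} + p_1 S_{21}}{p_2}\right) x_1 + S_{12} x_2}{\left(\dfrac{S_{10} + p_2 S_{12}}{p_1}\right) x_2 + S_{21} x_1}, \qquad (5.20)$$

or, defining $\theta_i = t_i/p_i$ and dividing the numerator and denominator of
the right-hand side of (5.20) by $x_1 x_2$, we obtain [Corlett and Hague
(1953-54) and Harberger (1964)]

$$\frac{\theta_1}{\theta_2} = \frac{\varepsilon_{20} + \varepsilon_{21} + \varepsilon_{12}}{\varepsilon_{10} + \varepsilon_{21} + \varepsilon_{12}}, \qquad (5.21)$$

where, as before, $\varepsilon_{ij}$ is the compensated cross-elasticity
$S_{ij}\,p_j/x_i$. As we discovered above, $\theta_1 = \theta_2$ is an optimal
solution only if the cross-elasticities $\varepsilon_{10}$ and
$\varepsilon_{20}$ are equal.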

Because $\Delta > 0$, expression (5.21) calls for a higher tax on the taxed good
that is the relative complement to the numeraire ($\varepsilon_{i0}$ is
smaller). This has generated the somewhat misleading explanation that we
“cannot” tax good zero, so we minimize distortions by taxing more heavily its
relative complement. Recall that the choice of untaxed good is arbitrary, and
that (5.21) applies for any numbering of the three goods.
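The three-good rule can be checked numerically. The sketch below, written in Python with hypothetical prices, purchases, and Slutsky terms, computes $\theta_1/\theta_2$ in two ways: directly from the ratio of the two tax expressions (the common factor $(\mu - \alpha)/(\mu\Delta)$ cancels), and from the elasticity form, assuming that form reads $(\varepsilon_{20} + \varepsilon_{21} + \varepsilon_{12})/(\varepsilon_{10} + \varepsilon_{21} + \varepsilon_{12})$ with $\varepsilon_{ij} = S_{ij}p_j/x_i$.

```python
def corlett_hague_ratio(e10, e20, e12, e21):
    """theta_1 / theta_2 given compensated cross-elasticities e_ij = S_ij * p_j / x_i."""
    return (e20 + e21 + e12) / (e10 + e21 + e12)


# Hypothetical data with p0 = 1. The own terms S11 and S22 are implied by the
# homogeneity restriction S_i0 + p1*S_i1 + p2*S_i2 = 0 for i = 1, 2.
p1, p2 = 2.0, 4.0
x1, x2 = 5.0, 10.0
S10, S20, S12 = 1.0, 3.0, 0.5
S21 = S12  # symmetry of the Slutsky matrix
S11 = -(S10 + p2 * S12) / p1
S22 = -(S20 + p1 * S21) / p2

# Ratio of specific taxes from the inverted two-by-two system, then ad valorem.
t_ratio = (-S22 * x1 + S12 * x2) / (S21 * x1 - S11 * x2)
theta_ratio_direct = t_ratio * p2 / p1

# The same ratio expressed through elasticities.
theta_ratio_elasticities = corlett_hague_ratio(
    e10=S10 / x1, e20=S20 / x2, e12=S12 * p2 / x1, e21=S21 * p1 / x2
)
print(theta_ratio_direct, theta_ratio_elasticities)
```

Here good 1 has the smaller cross-elasticity with good zero (0.2 against 0.3), so it is the relative complement to the numeraire and bears the higher ad valorem rate.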

For a larger number of commodities, a simple result obtains if we assume that
the matrix $S$ is diagonal: all cross-effects except with respect to good zero
are zero. Since $\sum_j S_{ij}\,p_j = 0$, this implies that, for
$i = 1, \ldots, N$,

$$S_{ii}\,p_i + S_{i0} = 0. \qquad (5.22)$$

Thus, this restriction does depend on the choice of untaxed commodity. With
such a simplification, (5.18) yields the expressions

$$t_i = -\frac{\mu - \alpha}{\mu}\,\frac{x_i}{S_{ii}} \quad\text{or}\quad \theta_i = -\frac{\mu - \alpha}{\mu}\,\frac{1}{\varepsilon_{ii}}. \qquad (5.23)$$


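This is the celebrated “inverse elasticity” rule that calls for higher
proportional taxes on goods with relatively low own-price elasticities. By
(5.22), $\varepsilon_{ii} = -\varepsilon_{i0}$, so this rule is equivalent to
taxing more heavily the goods that are relatively complementary to the untaxed
good,

$$\theta_i = \frac{\mu - \alpha}{\mu}\,\frac{1}{\varepsilon_{i0}}, \qquad (5.24)$$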


as  derived  above  for  the  three-good  case. 

Since  the  inverse  elasticity  rule  results  from  a restriction  on  preferences,  the 
choice  of  untaxed  good  becomes  relevant  in  that  it  may  make  more  sense  to 
assume  no  cross-effects  among  taxed  goods  if  labor  is  numeraire  and  the  other 
goods  are  commodities,  than  to  do  so  if  one  of  the  commodities  serves  as  the 
untaxed  good. 

The inverse elasticity rule of (5.24) is expressed in terms of compensated
elasticities. Yet in various places in the literature [Diamond and Mirrlees
(1971) and Bradford and Rosen (1976)], it is expressed in terms of
uncompensated elasticities. This is the result neither of a revision of demand
theory nor of an assumption of zero income effects. Rather, it comes about
because of a different, and equally arbitrary, restriction on preferences. We
can express the optimal tax formulae in terms of ordinary uncompensated demands
by rearranging (5.4),

$$-\sum_j t_j \frac{\partial x_j}{\partial p_i} = \frac{\mu - \lambda}{\mu}\,x_i, \qquad (5.25)$$

which, assuming $\partial x_j/\partial p_i = 0$ unless $j = 0$ or $j = i$, yields

$$\theta_i = \frac{\mu - \lambda}{\mu}\,\frac{1}{\eta_{ii}}, \qquad (5.26)$$

where $\eta_{ii} = -(p_i/x_i)(\partial x_i/\partial p_i)$ is the uncompensated
own-elasticity of demand for good $i$. Expressions (5.26) and (5.24) differ
because they result from different restrictions on the structure of
preferences: different matrices are being assumed diagonal.
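Both versions of the inverse elasticity rule share one computational form: each ad valorem rate is a common scale factor divided by an elasticity. The short Python sketch below illustrates only that shared form. The scale factor and the elasticities are hypothetical numbers, and whether they should be read as compensated or uncompensated magnitudes depends on which restriction on preferences is being assumed.

```python
def inverse_elasticity_rates(scale, elasticities):
    """Ad valorem rates theta_i = scale / elasticity_i.

    `scale` stands in for the common multiplier term (for example
    (mu - alpha)/mu in the compensated version); `elasticities` are
    positive elasticity magnitudes, one per taxed good.
    """
    return [scale / e for e in elasticities]


# Hypothetical example: three goods with increasingly elastic demand.
rates = inverse_elasticity_rates(0.05, [0.25, 0.5, 1.0])
print(rates)
```

The least elastic good receives the highest rate, and doubling an elasticity halves the corresponding rate.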


5.2.  Minimizing  excess  burden  through  optimal  taxation 

By  its  definition,  excess  burden  ought  to  be  minimized  when  taxes  are  chosen  to 
maximize  utility.  However,  even  for  the  fixed  producer  price  case,  we  have  at  least 
two  candidates  for  measuring  excess  burden,  and  they  will  generally  take  on 
different  values.  It  turns  out  that  only  one  of  these,  that  based  on  the  equivalent 
variation,  satisfies  the  desirable  duality  property  of  being  minimized  by  optimal 
taxes  [Kay  (1980)]. 
Recall from (2.9) that the equivalent variation measure of the excess burden of
a tax is

$$EB_E = E(p_1, V(p_1, y)) - E(p_0, V(p_1, y)) - R = y - E(p_0, V(p_1, y)) - R. \qquad (5.27)$$

Thus, minimizing this for a given value of $R$ amounts to maximizing
$E(p_0, V(p_1, y))$. But, for a given price vector, expenditure increases
monotonically with the level of utility. Thus, we are maximizing $V(p_1, y)$,
just as in the optimal tax problem. This is easily verified by differentiating
the Lagrangian

$$E(p_0, V(p_1, y)) + \mu\left(R - (p_1 - p_0)\cdot x\right), \qquad (5.28)$$

with respect to $p_1$.

For the compensating variation measure, which [from (2.10)] equals

$$EB_C = E(p_1, V(p_0, y)) - E(p_0, V(p_0, y)) - R = E(p_1, V(p_0, y)) - y - R, \qquad (5.29)$$

minimizing excess burden amounts to minimizing $E(p_1, V(p_0, y))$: choosing
taxes to minimize the expenditure necessary to achieve the pre-tax utility
level. This need not be the same price vector as the one dictated by optimal
taxation. The appropriate Lagrangian here is

$$E(p_1, V(p_0, y)) - \mu\left(R - (p_1 - p_0)\cdot x\right), \qquad (5.30)$$

which yields first-order conditions

$$-x_i + \mu\left[\sum_j t_j \frac{\partial x_j}{\partial p_i} + x_i\right] = 0, \qquad (5.31)$$

which looks like the one derived from (5.28). However, the value of $x$ here is
at the hypothetical point at higher prices but with compensation. In the
previous case, it is at the actual optimal tax point.13

13 A fortiori, it can be seen that replacing $p_0$ with any arbitrary “reference
price vector” $p_R$ in the expenditure function in (5.27), to define a
different concept of excess burden, i.e.,

$$EB_R = E(p_1, V(p_1, y)) - E(p_R, V(p_1, y)) - R,$$

would also yield a measure consistent with the optimal tax problem [King
(1983b)].
This problem with the compensating variation also means that we cannot
compare two hypothetical alternatives to a given tax situation by comparing
their marginal excess burden measures. Only if preferences are homothetic
[Chipman and Moore (1980)] will this problem disappear. Of course, for pairwise
comparisons, where the “initial” point is not well-defined, the equivalent
variation and compensating variation are symmetrically defined, so there can be
no a priori benefit of using one versus the other.
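A small numerical illustration may help fix the two measures. The Python sketch below assumes Cobb-Douglas preferences over two goods (a hypothetical choice, with budget share `a` on the taxed good), for which the expenditure function is proportional to $p_1^{a} p_2^{1-a}$. It evaluates the equivalent variation measure with revenue at the actual post-tax demand and the compensating variation measure with revenue at the compensated demand, as the discussion above describes. All numbers are made up for the example.

```python
def excess_burdens(a, y, q1, t):
    """Equivalent- and compensating-variation excess burden of a tax t on good 1.

    Assumes Cobb-Douglas preferences with budget share `a` on good 1, so that
    E(p, U) is proportional to p1**a * p2**(1 - a); the price of good 2 is fixed.
    """
    p1 = q1 + t
    price_index = (p1 / q1) ** a  # E(p1, U) / E(p0, U), the same for every U

    # EV measure: y - E(p0, V(p1, y)) - R, with R collected at the actual demand.
    revenue_actual = t * a * y / p1
    eb_ev = y - y / price_index - revenue_actual

    # CV measure: E(p1, V(p0, y)) - y - R, with R collected at the compensated
    # demand that keeps the consumer at the pre-tax utility level.
    income_needed = y * price_index
    revenue_compensated = t * a * income_needed / p1
    eb_cv = income_needed - y - revenue_compensated
    return eb_ev, eb_cv


eb_ev, eb_cv = excess_burdens(a=0.5, y=100.0, q1=1.0, t=0.21)
print(eb_ev, eb_cv)
```

The two measures are both positive but take different values. Because these assumed preferences are homothetic, they differ only by the factor `price_index`; with non-homothetic preferences no such simple relation should be expected.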


5.3.  Changing  producer  prices 

The  simple  relaxation  of  the  fixed  producer  price  assumption  has,  perhaps 
surprisingly,  no  effect  at  all  on  the  optimal  tax  formulae  in  (5.18)  as  long  as 
producer  prices  result  from  competitive  behavior  and  any  pure  profits  are  taxed 
away  by  the  government. 

In place of the fixed producer price assumption of Section 5.1, we assume that
production is governed by the production function

$$f(z) = 0, \qquad (5.32)$$

where, as in Section 3, $z$ is the production vector in the commodities
$0, 1, \ldots, N$. By the assumption of competitive behavior, we know that the
producer prices $q$ are proportional to the vector of derivatives of $f$,
$\partial f = (f_0, f_1, \ldots, f_N)$. Without any loss of generality, we may
set this proportionality constant equal to $1/f_0$ and, as before, choose good
zero as numeraire, i.e., $q_0 = 1$.

The  government’s  revenue  requirement  must  now  be  specified  in  terms  of 
individual  commodities  (as  was  the  case  of  the  compensation  vector  in  Section  3), 
since relative producer prices can change. We refer to this as the revenue
vector, $R$. Thus, $z = x + R$, where $x$ is the household's vector of net
purchases.

Once production has been generalized to this stage, the possibility arises of
pure profits coming from decreasing returns to scale. We will consider this
more general case after first solving the optimal tax problem when $f(\cdot)$
embodies constant returns to scale, i.e., is homogeneous of degree zero in all
commodities. By Euler's Theorem, profits are $q \cdot z = 0$. Thus, the
government's optimization problem becomes

$$\max_{p}\ V(p) \quad\text{subject to}\quad f(x + R) = 0, \qquad (5.33)$$

where, because pure profits are zero, we can set $p_0 = q_0 = 1$ without any
loss of generality, and choose only $p_1, \ldots, p_N$. To use $p$ rather than
$t$ as the control variables, we must ensure that arbitrary changes in $p$ can
be brought about by changes in $t$. This is accomplished by noting that

$$dp = dt + dq = dt + d(\partial f) = dt + H(dx + dR), \qquad (5.34)$$

where $H$ is the Hessian $\partial^2 f$ of the production function, as before.
Since $dR = 0$ and $dx$ may be characterized by the Slutsky equation, we have

$$dp = dt + H\left[S - \frac{\partial x}{\partial y}\,x'\right]dp, \qquad (5.35)$$

or

$$dp = \left[I - H\left(S - \frac{\partial x}{\partial y}\,x'\right)\right]^{-1} dt,$$

where $S$ is the Slutsky matrix. Moreover, since the changes in $t$ are
constrained to keep revenue constant, and hence, in the neighborhood of the
optimum, utility as well, the changes in $x$ are compensated and (5.35)
simplifies to

$$dp = [I - HS]^{-1}\,dt = \Omega\,dt. \qquad (5.36)$$

As long as $\Omega$ exists (i.e., $[I - HS]$ is of full rank), we may control
$t$ indirectly through $p$.
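To see what the mapping from tax changes to consumer price changes implies in the simplest setting, the Python sketch below evaluates a one-good version of $dp = [I - HS]^{-1}dt$, namely $dp = dt/(1 - HS)$ with scalar $H$ and $S$. The numbers are hypothetical, and the sign assumptions ($H \ge 0$ for rising marginal cost, $S < 0$ for the own substitution effect) are assumptions of the example.

```python
def consumer_price_change(dt, H, S):
    """Scalar (one taxed good) version of dp = [I - H S]^{-1} dt."""
    factor = 1.0 - H * S
    if factor == 0.0:
        # Corresponds to [I - HS] not being of full rank.
        raise ValueError("1 - H*S is zero: dt does not pin down dp")
    return dt / factor


# Hypothetical numbers.
full_shift = consumer_price_change(dt=1.0, H=0.0, S=-2.0)     # fixed producer price
partial_shift = consumer_price_change(dt=1.0, H=0.5, S=-2.0)  # producer price adjusts
print(full_shift, partial_shift)
```

With $H = 0$ the fixed producer price case is recovered and the tax is passed through one for one; with $H > 0$ in this example only half of the tax change shows up in the consumer price.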

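The Lagrangian corresponding to (5.33) yields the first-order conditions

$$-\lambda x_i - \mu \sum_j f_j \frac{\partial x_j}{\partial p_i} = 0, \qquad i = 1, \ldots, N, \qquad (5.37)$$

where $\lambda = \partial V/\partial y$ and $\mu$ is the Lagrange multiplier on
the production constraint. Since $p \cdot x = 0$,

$$\sum_j p_j \frac{\partial x_j}{\partial p_i} + x_i = 0. \qquad (5.38)$$

Using this and the fact that $q = \partial f$, we may express (5.37) as

$$-\lambda x_i + \mu\left[\sum_j t_j \frac{\partial x_j}{\partial p_i} + x_i\right] = 0, \qquad (5.39)$$

which is precisely condition (5.4). This result is due to Diamond and Mirrlees
(1971).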
In the more general case where $f(\cdot)$ is not homogeneous of degree zero,
there may be pure profits, $y = q \cdot z > 0$. In this case, we know from
before that equal taxes on all commodities amount to a profits tax on $y$,
giving us $N + 1$ rather than $N$ independent instruments. Hence, if we cannot
tax one good, this represents a restriction unless we can tax profits directly.
For expositional purposes, it is easiest to let the $N + 1$ instruments be the
taxes on goods $1, \ldots, N$ and the profits tax, keeping $t_0 = 0$. We let
$\tau$ be the rate of profits tax. The Lagrangian now is

$$V(p, (1 - \tau)y) - \mu f(x + R). \qquad (5.40)$$

Using the fact that $p \cdot x = (1 - \tau)y$, we may arrange the $N$
first-order conditions with respect to the taxes $t_1, \ldots, t_N$ to be

$$-\lambda x_i + \lambda(1 - \tau)\frac{\partial y}{\partial p_i} + \mu\left[\sum_j t_j \frac{dx_j}{dp_i} + x_i - (1 - \tau)\frac{\partial y}{\partial p_i}\right] = 0. \qquad (5.41)$$

It is straightforward to show that if $\tau$ may be freely varied, then the
$N + 1$ first-order conditions are solved for $t = 0$ and $\lambda = \mu$: no
excess burden, with profits taxes being used to raise all revenue. However, if
$\tau$ is constrained, we must solve the $N$ conditions (5.41), given $\tau$.
Unless profits taxes just happen to equal $q \cdot R$, we again face an optimal
tax problem.

If $\tau = 1$, so that all profits are taxed away, then (5.41) reduces to the
previous optimal tax program, (5.39). Thus, pure profits do not change the
picture unless they accrue at least partially to the household [Stiglitz and
Dasgupta (1971)]. If $\tau$ is fixed at some value not equal to one, the
formulas differ.

Since producer prices, and hence profits, change with $p$, the derivatives
$dx_j/dp_i$ in (5.41) include the indirect effect of $p_i$ on profits through
changes in production,

$$\frac{dx_j}{dp_i} = \frac{\partial x_j}{\partial p_i} + \frac{\partial x_j}{\partial y'}\,\frac{\partial y'}{\partial p_i}, \qquad (5.42)$$

where $y' = (1 - \tau)y$. Using (5.42), the Slutsky equation, and the definition
of $\alpha$, the social marginal utility of income, from (5.5), we may rewrite
(5.41) as

$$\sum_j S_{ij}\,t_j = -\frac{\mu - \alpha}{\mu}\left(x_i - (1 - \tau)\frac{\partial y}{\partial p_i}\right), \qquad (5.43)$$

which differs from (5.6) only through the replacement of $x_i$ with
$(x_i - (1 - \tau)(\partial y/\partial p_i))$. One can interpret these terms as
the net increase in resources needed to maintain a given level of utility with
respect to an increase in $p_i$ in the two respective cases.
If the profits tax $\tau = 0$, and if good zero is the single production factor
and the sole good from which revenue is extracted, then one can show that
(5.43) yields the result obtained above for fixed producer prices, that to a
first-order Taylor approximation, substituting optimal taxes for lump sum taxes
causes an equiproportional reduction in the output of all taxed commodities. A
fortiori, the outcome also holds for the constant returns case just examined.
This result is due to Stiglitz and Dasgupta (1971), who in turn attribute it to
Ramsey (1927), though the exact equivalence is obscured by differences in
methodology.

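The key to the single-factor assumption is that, since the production function
may be written

$$f(x) = \tilde{f}(x_1, \ldots, x_N) - x_0, \qquad (5.44)$$

the Hessian $H = \partial^2 f$ is block diagonal in the untaxed good and all
other goods ($H_{i0} = H_{0i} = 0$ for $i \neq 0$). Thus, the product of $H$ and
the substitution matrix $S$ is

$$HS = \begin{pmatrix} H_{00} & 0 \\ 0 & \tilde{H} \end{pmatrix}\begin{pmatrix} S_{00} & S_0' \\ S_0 & \tilde{S} \end{pmatrix} = \begin{pmatrix} H_{00}S_{00} & H_{00}S_0' \\ \tilde{H}S_0 & \tilde{H}\tilde{S} \end{pmatrix}, \qquad (5.45)$$

where $S_0' = (S_{01}, \ldots, S_{0N})$ and $\tilde{H}$ and $\tilde{S}$ are the
blocks of $H$ and $S$ for goods 1 through $N$. This means that the changes in
consumer prices of the taxed goods, $\tilde{p} = (p_1, \ldots, p_N)$, can be
expressed [using (5.36)] in the neighborhood of the optimum as

$$d\tilde{p} = [I - \tilde{H}\tilde{S}]^{-1}\,d\tilde{t} = \tilde{\Omega}\,d\tilde{t}, \qquad (5.46)$$

where $\tilde{t} = (t_1, \ldots, t_N)$. That is, $d\tilde{p}$ does not depend on
the demand for $x_0$. From (5.46), we may express the first-order change around
$\tilde{t} = 0$ in $\tilde{x}$, the vector of taxed goods, as

$$\Delta\tilde{x} = \tilde{S}\,\Delta\tilde{p} = \tilde{S}\tilde{\Omega}\,\Delta\tilde{t} = \tilde{S}\tilde{\Omega}\,\tilde{t} = \tilde{S}\tilde{\Omega}\tilde{S}^{-1}\tilde{S}\,\tilde{t}. \qquad (5.47)$$

The elements of the vector $\tilde{S}\tilde{t}$ are described in (5.43). By the
envelope theorem and the fact that $q_0 = 1$, we may solve for the term
$\partial y/\partial p_i$,

$$\frac{\partial y}{\partial p_i} = \sum_{j>0} z_j \frac{\partial q_j}{\partial p_i} = \sum_{j>0} z_j \sum_{k} H_{jk}S_{ki} = \sum_{j>0} z_j \sum_{k>0} H_{jk}S_{ki}, \qquad (5.48)$$

where the last step relies on the assumption that $H_{j0} = 0$ for $j \neq 0$.
Stacking these conditions yields

$$\frac{\partial y}{\partial \tilde{p}} = \tilde{S}\tilde{H}\tilde{z}, \qquad (5.49)$$

where $\tilde{z} = (z_1, \ldots, z_N)$. But by assumption, all revenue is spent
on good zero, so $\tilde{z} = \tilde{x}$. Since, also by assumption,
$\tau = 0$, it follows from (5.43) that

$$\tilde{S}\tilde{t} = -\frac{\mu - \alpha}{\mu}\left(I - \tilde{S}\tilde{H}\right)\tilde{x}. \qquad (5.50)$$

Substituting (5.50) into (5.47), we obtain

$$\Delta\tilde{x} = -\frac{\mu - \alpha}{\mu}\,\tilde{S}\tilde{\Omega}\tilde{S}^{-1}\left(I - \tilde{S}\tilde{H}\right)\tilde{x} = -\frac{\mu - \alpha}{\mu}\,\tilde{x}, \qquad (5.51)$$

as required.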
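The final step of this argument rests on a matrix identity: with $\Omega = [I - HS]^{-1}$ for the taxed-goods blocks, the product $S\,\Omega\,S^{-1}(I - SH)$ is the identity matrix, so the demand change is a scalar multiple of the demand vector. The Python sketch below checks this for one pair of hypothetical symmetric 2x2 blocks. The matrices are invented for the illustration and the helper functions handle only the 2x2 case.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def identity_minus(a):
    """Return I - a for a 2x2 matrix."""
    return [[(1.0 if i == j else 0.0) - a[i][j] for j in range(2)]
            for i in range(2)]


# Hypothetical blocks for the taxed goods (both symmetric).
S = [[-1.5, 0.5], [0.5, -1.0]]  # Slutsky block
H = [[0.4, 0.1], [0.1, 0.2]]    # production Hessian block

omega = inverse(identity_minus(matmul(H, S)))  # [I - HS]^{-1}
product = matmul(matmul(matmul(S, omega), inverse(S)),
                 identity_minus(matmul(S, H)))
print(product)
```

Up to floating-point rounding, `product` is the 2x2 identity, which is why the proportional reduction in demand is the same for every taxed good in this case.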

In the special case where both $\tilde{H}$ and $\tilde{S}$ are diagonal (i.e.,
there is no joint production and commodity demands are independent except with
relation to the numeraire), the expression (5.49) for
$\partial y/\partial \tilde{p}$ simplifies to

$$\frac{\partial y}{\partial p_i} = z_i H_{ii} S_{ii}, \qquad (5.52)$$

which, if we again assume that all revenue raised is spent on the numeraire
($z_i = x_i$ for $i > 0$), allows us to rewrite (5.43) as

$$S_{ii}\,t_i = -\frac{\mu - \alpha}{\mu}\,x_i\left(1 - (1 - \tau)H_{ii}S_{ii}\right), \qquad (5.53)$$

or

$$\theta_i = \frac{\mu - \alpha}{\mu}\;\frac{\dfrac{1}{\varepsilon_{ii}} + (1 - \tau)\dfrac{1}{\sigma_{ii}}}{1 + \dfrac{\mu - \alpha}{\mu}(1 - \tau)\dfrac{1}{\sigma_{ii}}},$$

where $\varepsilon_{ii} = -S_{ii}(p_i/x_i)$,
$\sigma_{ii} = (1/H_{ii})(q_i/x_i)$ and $\theta_i = t_i/p_i$ are the demand and
supply elasticities and ad valorem tax for good $i$. [See Stiglitz and Dasgupta
(1971) for a slightly different formulation. Also see Atkinson and Stiglitz
(1980).]
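The role of the supply elasticity and the profits tax can be illustrated numerically. The Python sketch below assumes the rule takes the form $\theta_i = \kappa\,(1/\varepsilon_{ii} + (1-\tau)/\sigma_{ii})\,/\,(1 + \kappa(1-\tau)/\sigma_{ii})$ with $\kappa = (\mu - \alpha)/\mu$, which is what results from solving $t_i = \kappa\,[p_i/\varepsilon_{ii} + (1-\tau)\,q_i/\sigma_{ii}]$ with $q_i = p_i - t_i$. The parameter values are hypothetical.

```python
def ad_valorem_rate(kappa, eps, sigma, tau):
    """Ad valorem rate with demand elasticity eps, supply elasticity sigma,
    and profits tax rate tau, under the assumed form stated above."""
    shift = (1.0 - tau) / sigma
    return kappa * (1.0 / eps + shift) / (1.0 + kappa * shift)


# Hypothetical parameters.
kappa, eps, sigma = 0.1, 0.5, 2.0
theta_full_profit_tax = ad_valorem_rate(kappa, eps, sigma, tau=1.0)
theta_no_profit_tax = ad_valorem_rate(kappa, eps, sigma, tau=0.0)
print(theta_full_profit_tax, theta_no_profit_tax)
```

With $\tau = 1$ the supply elasticity drops out and the rate reduces to $\kappa/\varepsilon_{ii}$, the fixed producer price result. With untaxed profits and these numbers, the rate is higher, reflecting the fact that the commodity tax then also falls partly on profits.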

5.4.  Production  efficiency 

Thus  far,  we  have  assumed  production  to  be  efficient,  with  the  only  distortions 
imposed  by  taxes  to  be  with  respect  to  household  decisions.  However,  government 
can  induce  distortions  in  production,  either  through  differential  taxation  of 
factors  in  different  uses  or  through  the  use  of  different  shadow  prices  in  public 
enterprises  than  those  generated  by  coexisting  competitive  private  markets.  Should 
these  extra  policy  instruments  be  used?  Under  certain  well-defined  conditions, 
they  should  not. 

To  consider  the  desirability  of  such  distortions,  we  follow  Diamond  and 
Mirrlees  (1971)  and  suppose  there  to  be  two  production  sectors,  each  efficient  in 
its  own  production  behavior.  We  shall  refer  to  these  as  the  private  and  public 
sectors,  though  in  some  cases  it  may  be  more  useful  to  think  of  them  both  as 
subsectors  of  the  private  sector.  The  results  are  easily  extended  to  several  sectors. 

As before, we let $f(\cdot)$ and $z$ be the production function and output of
the private sector, and introduce $g(\cdot)$ and $s$ as the corresponding
variables for the public sector. The use of distortions in the allocation of
resources between the two sectors may be thought of as the direct choice of
public inputs, $s$. Thus, the government's expanded choice problem is

$$\max_{p,\,s}\ V(p, (1 - \tau)y) \quad\text{subject to}\quad f(x + R - s) = 0 \ \text{and}\ g(s) = 0, \qquad (5.54)$$

where $y$ is private sector profits. Attaching the Lagrange multipliers $\mu$
and $\zeta$ to the production constraints, we obtain the same first-order
conditions as before with respect to $p$. With respect to $s$, we get

$$\lambda(1 - \tau)\frac{\partial y}{\partial s_i} - \mu\sum_j f_j \frac{\partial x_j}{\partial s_i} + \mu f_i - \zeta g_i = 0. \qquad (5.55)$$

Using the normalization $q = \partial f$ and the consumer's budget constraint,
we rewrite this as

$$\lambda(1 - \tau)\frac{\partial y}{\partial s_i} + \mu\left[\sum_j t_j \frac{\partial x_j}{\partial s_i} - (1 - \tau)\frac{\partial y}{\partial s_i}\right] + \mu f_i - \zeta g_i = 0, \qquad (5.56)$$

or

$$-(\mu - \alpha)(1 - \tau)\frac{\partial y}{\partial s_i} + \mu f_i - \zeta g_i = 0,$$

where, as before, $\alpha = \lambda + \mu(\partial R/\partial y)$ is the social
marginal utility of income. Thus, there are two important cases in which
efficient overall production ($f_i/f_j = g_i/g_j$) will result: constant
returns to scale in the private sector [Diamond and Mirrlees (1971)] and
decreasing returns with 100 percent profits taxation [Stiglitz and Dasgupta
(1971)]. Otherwise, inefficient production will be part of the optimal
solution. The basic intuition is that as long as we can tax all but one of the
commodities, we can bring about any possible configuration of relative prices
consistent with a given level of revenue. When after-tax profits
$(1 - \tau)y$ equal zero, these prices are the sole determinants of the
consumer's decision. Thus, any attainment of a set of relative prices using a
production distortion could also be obtained without one, with the simple
result that the consumer could be made better off. Note that this logic only
holds if all the taxes $t_1$ through $t_N$ can be adjusted. With some of these
held fixed, production inefficiencies may be helpful in imposing indirect taxes
on the goods that cannot be freely taxed directly. We return to this point
below in our discussion of tax reform.

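For the case where profits are not zero, we may simplify (5.56) for the case of
independent production. Considering $\partial y/\partial s_i$, we have (using
the envelope theorem and independence assumption)

$$\frac{\partial y}{\partial s_i} = -z_i\frac{\partial q_i}{\partial z_i} + (1 - \tau)\sum_j z_j \frac{\partial q_j}{\partial z_j}\,\frac{\partial x_j}{\partial y'}\,\frac{\partial y}{\partial s_i}, \qquad (5.57)$$

which, using the facts that $q = \partial f$ and $\partial q = H$, and the
assumption that all government expenditures are on the numeraire commodity
($\tilde{x} = \tilde{z}$), we may solve as

$$\frac{\partial y}{\partial s_i} = \frac{-x_i H_{ii}}{1 - (1 - \tau)\sum_j x_j H_{jj}(\partial x_j/\partial y')} = -\frac{q_i/\sigma_{ii}}{\Gamma}, \qquad (5.58)$$

where $\sigma_{ii}$ is the supply elasticity for good $i$, and $\Gamma$ must be
positive for a stable solution. Thus (5.56) yields

$$\frac{g_i}{g_j} = \frac{f_i}{f_j}\left(\frac{1 + k/\sigma_{ii}}{1 + k/\sigma_{jj}}\right), \qquad (5.59)$$

where $k > 0$.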
6.  Optimal taxation and the structure of preferences

This  section  considers  the  implications  of  the  tax  formulae  derived  above  for 
actual  tax  rates  under  different  assumptions  about  the  structure  of  preferences, 
and  for  the  more  general  case  where  there  are  several  individuals  and  hence 
distributional  objectives  to  be  satisfied.  Although  the  results  already  presented 
expressed  the  optimal  taxes  in  terms  of  the  demands  and  substitution  matrix  of 
the  representative  consumer,  these  terms  are  not  generally  constant,  so  we  have 
little  insight  into  the  general  conditions  on  consumer  preferences  required  for 
either  uniform  taxation  or  any  other  specific  tax  structure  to  be  optimal.  In 
exploring  this  question,  we  will  also  be  able  to  investigate  more  easily  the  impact 
of  distributional  objectives  on  the  optimal  tax  structure. 


6.1.  Optimal  taxation  from  the  dual  perspective 

To  consider  the  role  of  preferences  in  determining  optimal  tax  rules,  it  is  helpful 
to  derive  such  rules  using  the  direct  utility  function  rather  than  the  indirect  utility 
function.  Though  the  derivation  is  less  straightforward,  the  results  are  in  terms  of 
the  characteristics  of  the  utility  function  and,  hence,  preferences.  This  approach  is 
taken  by  Atkinson  and  Stiglitz  (1972,  1976,  1980).  However,  a simpler  and  more 
elegant  way  of  arriving  at  their  results  is  by  transforming  the  optimal  tax 
formulae  themselves  using  duality  theory.  The  technique  described  by  Deaton 
(1979a,  1981a,  1981b)  makes  use  of  the  “distance”  function,  sometimes  referred 
to  as  the  “direct”  expenditure  function  [Cooter  (1979)].  Our  analysis  here  will 
generally follow that of Deaton. Because consumer preferences are defined with
respect to consumption, rather than purchases, it is useful to separate these
concepts by letting the vector of purchases $x$ equal $\hat{x} - \bar{x}$,
where $\hat{x}$ is the consumption vector and $\bar{x}$ the endowment vector.
Thus, we may rewrite the indirect utility function $V(p)$, which implicitly
holds $\bar{x}$ as fixed, as $V(p, p \cdot \bar{x})$, which does not. This
allows us to consider the effects of changes in the consumer's lump sum income.

In words, the distance function is the solution to the following problem:
consider a consumption bundle $\hat{x}$, and also all the combinations of price
vector $p^*$ and total endowment income $y$ such that $V(p^*, y)$ equals
(strictly speaking, at most equals) some constant utility level $U$. Choose the
vector of prices that minimizes $p^* \cdot \hat{x}/y$, given $\hat{x}$. The
resulting value is the distance function $D(\hat{x}, U)$. Algebraically, the
problem is

$$\min_{p^*,\,y}\ (p^* \cdot \hat{x})/y \quad\text{subject to}\quad V(p^*, y) \le U. \qquad (6.1)$$

It is explained diagrammatically in Figure 6.1, for the case of two goods. For
simplicity, we assume that $\hat{x}$ is on the indifference curve corresponding
to the utility level $U$, although only the scale of $D(\cdot)$ and not the
price vector chosen would be affected by increasing or decreasing $\hat{x}$
along the ray shown. This is easily verified from inspection of (6.1), since
minimizing $(p^* \cdot \hat{x})/y$ is equivalent to minimizing
$(p^* \cdot \lambda\hat{x})/y$ for any $\lambda > 0$. By choosing $\hat{x}$ to be
just feasible, given $U$, we will obtain a value $D(\hat{x}, U) = 1$.

Figure 6.1. The distance function.

The figure depicts two different combinations of $p^*$ and $y$, indexed 1 and
2, that satisfy $V(p^*, y) = U$. Since the price vector $p_2^*$ results in a
tangency away from $\hat{x}$, purchase of $\hat{x}$ would require a greater
expenditure than $y_2$. This is not the case with $p_1^*$, since it is tangent
to the indifference curve at $\hat{x}$. (A flatter budget line would again
necessitate an increase in expenditure to purchase $\hat{x}$.) Thus, the price
vector chosen, given $\hat{x}$ and $U$, is tangent to the indifference curve
corresponding to $U$ at point $\hat{x}$ (or, more generally, if $\hat{x}$ is not
on the indifference curve, at the point on the indifference curve on the ray
through $\hat{x}$ from the origin). Just as the indirect expenditure function
chooses consumption, given prices and utility, the distance function chooses
normalized prices, given consumption and utility. Since these prices are based
on the consumer's indirect utility function, we may interpret them as points on
the consumer's inverse compensated demand curve, expressing willingness to pay.
By the envelope theorem, the partial derivatives of the distance function with
respect to the elements of $\hat{x}$ are those normalized inverse demands:

$$\frac{\partial D}{\partial \hat{x}_i} = a_i(\hat{x}, U) = \frac{p_i^*}{y}. \qquad (6.2)$$

The Hessian of the distance function is referred to as the Antonelli matrix
$A = (a_{ij})$.14

Now, consider the actual price vector that prevails, $p$, and choose $\hat{x}$
such that $\hat{x} = x^c(p, U)$. Then by construction, $p^* = p$ and
$y = E(p, U)$ solve (6.1), and we have the identity [from (6.2)]

$$a_i(x^c(p, U), U) = \frac{p_i}{E(p, U)}. \qquad (6.3)$$

Multiplying (6.3) through by $E(p, U)$, and differentiating with respect to
each price, we obtain conditions which can be stacked to yield

$$E(p, U)\,AS = I - a\,x^c(p, U)', \qquad (6.4)$$

where $a = (a_0, \ldots, a_N)$. Evaluating at $U = V(p, p \cdot \bar{x})$, this
yields

$$(p \cdot \bar{x})\,AS = I - a\,\hat{x}(p, p \cdot \bar{x})'. \qquad (6.5)$$
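The relation between the Antonelli and Slutsky matrices can be verified for a specific preference structure. The Python sketch below assumes Cobb-Douglas preferences with shares `beta` summing to one (a hypothetical choice made only because both the distance function, $D(x, U) = \prod_i x_i^{\beta_i}/U$, and the expenditure function, $E(p, U) = U\prod_i (p_i/\beta_i)^{\beta_i}$, have closed forms). It builds $A$ and $S$ at a point where $D = 1$ and compares $E \cdot AS$ with $I - a\,x'$.

```python
def antonelli_slutsky_check(beta, p, E):
    """Return (E*A*S, I - a x', a) for Cobb-Douglas preferences.

    beta: budget shares summing to one; p: prices; E: expenditure level.
    """
    n = len(beta)

    def delta(i, j):
        return 1.0 if i == j else 0.0

    # Compensated demands, and normalized inverse demands a_i = dD/dx_i at D = 1.
    x = [beta[i] * E / p[i] for i in range(n)]
    a = [beta[i] / x[i] for i in range(n)]

    # Antonelli matrix: Hessian of the distance function in quantities (at D = 1).
    A = [[beta[i] * beta[j] / (x[i] * x[j]) - delta(i, j) * beta[i] / x[i] ** 2
          for j in range(n)] for i in range(n)]
    # Slutsky matrix: Hessian of the expenditure function in prices.
    S = [[beta[i] * beta[j] * E / (p[i] * p[j]) - delta(i, j) * beta[i] * E / p[i] ** 2
          for j in range(n)] for i in range(n)]

    lhs = [[E * sum(A[i][k] * S[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    rhs = [[delta(i, j) - a[i] * x[j] for j in range(n)] for i in range(n)]
    return lhs, rhs, a


# Hypothetical shares, prices, and expenditure level.
lhs, rhs, a = antonelli_slutsky_check(beta=[0.2, 0.3, 0.5], p=[1.0, 2.0, 4.0], E=100.0)
print(lhs)
print(rhs)
```

The two matrices agree entry by entry, and the vector `a` equals `p[i] / E`, the normalized price vector. The example is a check under one assumed functional form and not a general proof.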

Now, let us return to the optimal tax results described in Section 5.
Multiplying both sides of (5.15) by $(p \cdot \bar{x})A$, and using the fact
that since $a$ is homogeneous of degree zero with respect to $\hat{x}$,
$A\hat{x} = 0$, we obtain

$$t = a(\bar{x} + x)'t - \frac{\mu - \alpha}{\mu}(p \cdot \bar{x})A(\hat{x} - \bar{x}) = a(R + t \cdot \bar{x}) + \frac{\mu - \alpha}{\mu}(p \cdot \bar{x})A\bar{x}, \qquad (6.6)$$

where $R = t \cdot x$ is tax revenue. Using the fact that $t_0 = 0$ to eliminate
$(\mu - \alpha)/\mu$, we obtain [Deaton (1981b)]

$$\theta_i = \frac{R + t \cdot \bar{x}}{p \cdot \bar{x}}\left(1 - \frac{(A\bar{x})_i}{(A\bar{x})_0}\,\frac{a_0}{a_i}\right) = \nu\left(1 - \frac{\sum_j \bar{x}_j\,\partial \ln a_i/\partial \hat{x}_j}{\sum_j \bar{x}_j\,\partial \ln a_0/\partial \hat{x}_j}\right), \qquad (6.7)$$

which, in turn, implies that, for any $i$ and $j$,

$$\theta_i - \theta_j = \nu' \sum_k \bar{x}_k \frac{\partial \ln (a_j/a_i)}{\partial \hat{x}_k}, \qquad (6.8)$$

where $\nu = (R + t \cdot \bar{x})/(p \cdot \bar{x})$ and
$\nu' = \nu/(\sum_j \bar{x}_j\,\partial \ln a_0/\partial \hat{x}_j)$. From
(6.8), we see that a sufficient condition for the taxes to be the same is that
the ratio of marginal valuations $(a_j/a_i)$ be independent of the consumption
of commodities in which the consumer has an endowment. This is equivalent to
the distance function being separable, or capable of being expressed as

$$D(\hat{x}, U) = f(\hat{x}_1, \hat{x}_2, U, \phi(\hat{x}_3, U)), \qquad (6.9)$$

where $\hat{x}_1$ are the commodities in which there is an endowment and
$\hat{x}_3$ are the goods on which taxes are uniform.15 It also follows that the
normal or indirect expenditure function is separable in the corresponding
prices [Gorman (1976)]. This separability of the expenditure function is
referred to as implicit separability and differs from the separability of the
direct and indirect utility functions.16 Indeed, they are the same only if the
utility function is homogeneous in $\hat{x}_3$ as well [Deaton (1981a)], and it
is easy to construct counter-examples for the case where preferences are just
weakly separable [Auerbach (1979a)].

14 See Deaton (1979a) for further discussion of the properties of the function
$D(\cdot)$ and the matrix $A$.

In  the  special  case  where  the  consumer’s  only  endowment  is  in  the  numeraire 
commodity,  good  zero  (presumably  leisure),  the  sufficient  (and  now  necessary,  as 
well)  condition  for  uniform  taxation  of  commodities  is  implicit  separability  from 
leisure.  It  is  also  possible  in  this  case  to  say  more  about  which  goods  will  be  taxed 
more heavily if weak separability but not homogeneity is satisfied. We begin by


15 Because $D(\cdot)$ is homogeneous of degree 1 in $x$, $f$ must be homogeneous of degree 1 in $x_1$, $x_2$
and $\phi$, and $\phi$ homogeneous of degree 1 in $x_3$.

16 (Weak) separability of the direct utility function, for example, would allow the utility function
$U(x)$ to be written $f(x_1, x_2, \phi(x_3))$, i.e., the marginal rate of substitution between elements of $x_3$ is
independent of the levels of $x_1$ and $x_2$.
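rewriting (6.8) as

\[
\theta_i - \theta_j = \nu'\,\bar{x}_0\,\frac{\partial \ln (a_j/a_i)}{\partial \tilde{x}_0}, \tag{6.10}
\]

where $\nu' = \nu/(\bar{x}_0\,a_{00}/a_0)$ and $a_{00} = \partial a_0/\partial \tilde{x}_0$.

By the convexity of $D(\cdot)$, $\nu'$ has the opposite sign of $\nu$ and hence is negative.
Since $a_j/a_i = p_j/p_i = U_j/U_i$,

\[
\frac{\partial \ln (U_j/U_i)}{\partial \tilde{x}_0}
= \frac{d \ln (a_j/a_i)}{d \tilde{x}_0}
= \frac{\partial \ln (a_j/a_i)}{\partial \tilde{x}_0} + \frac{\partial \ln (a_j/a_i)}{\partial U}\,\frac{dU}{d\tilde{x}_0} \tag{6.11}
\]

[Deaton (1981a)]. Thus, when utility is separable into goods and leisure, (6.10)
becomes

\[
\theta_i - \theta_j = -\nu'\,\bar{x}_0\,\frac{\partial \ln (a_j/a_i)}{\partial U}\,\frac{dU}{d\tilde{x}_0}, \tag{6.12}
\]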


so  that  taxes  will  be  higher  on  those  goods  that  are  necessities,  if  these  are  defined 
by  those  whose  valuation  by  the  consumer  declines  relatively  with  an  increase  in 
real  income.  This  is  particularly  important  if  we  use  empirical  demand  estimates 
based  on  restricted  functional  forms  to  estimate  optimal  taxes.  For  example,  the 
linear  expenditure  system 


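\[
x_i(p, p_0\bar{x}_0) = c_i + \frac{\beta_i}{p_i}\Big(p_0\bar{x}_0 - \sum_j p_j c_j\Big), \tag{6.13}
\]

often used in empirical work, comes from the Stone-Geary utility function

\[
U(x) = \sum_i \beta_i \ln (x_i - c_i), \tag{6.14}
\]

which is strongly separable, but not homogeneous unless the terms $c_i$ equal zero
(in which case it is simply Cobb-Douglas).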
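The way this functional form prejudges the tax structure can be seen in a small numerical sketch. The Python below is illustrative only: it assumes the textbook linear expenditure system with marginal budget shares `beta` (summing to one) and committed quantities `c`; the function names and all numbers are hypothetical and are not taken from the studies cited here.

```python
def les_demand(prices, income, c, beta):
    """Linear expenditure system: x_i = c_i + (beta_i / p_i) * (income - sum_j p_j c_j)."""
    if abs(sum(beta) - 1.0) > 1e-9:
        raise ValueError("marginal budget shares must sum to one")
    # Income left over after buying the committed bundle c.
    supernumerary = income - sum(p * ci for p, ci in zip(prices, c))
    return [ci + b * supernumerary / p for p, ci, b in zip(prices, c, beta)]


def budget_shares(prices, income, c, beta):
    """Share of income spent on each good under the LES."""
    demand = les_demand(prices, income, c, beta)
    return [p * x / income for p, x in zip(prices, demand)]


prices = [1.0, 2.0, 4.0]   # hypothetical prices
beta = [0.5, 0.3, 0.2]     # hypothetical marginal budget shares
c = [3.0, 0.5, 0.0]        # hypothetical committed quantities

poor = budget_shares(prices, 10.0, c, beta)
rich = budget_shares(prices, 100.0, c, beta)
# With c = 0 the system collapses to Cobb-Douglas: shares equal beta at any income.
cobb_douglas = budget_shares(prices, 10.0, [0.0, 0.0, 0.0], beta)
```

In this example the first good's budget share falls as income rises, so it behaves as a necessity, while the third good's share rises. Setting every committed quantity to zero makes the shares constant, which is the homogeneous case in which uniform taxation would be called for.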
Once we allow for the presence of several individuals with different tastes or
income, distributional considerations become an issue.17 As stressed in Section 3,
these  considerations  must  be  represented  by  the  specification  of  an  explicit  social 
welfare  function  based  on  individual  utilities.  This  cannot  normally  be  achieved 
by  the  direct  choice  of  distributional  weights  on  individual  income  unless  the 
weights  are  allowed  to  change  with  prices  in  a complicated  fashion.  There  are  two 
problems  we  consider  in  this  subsection.  First,  when  and  how  are  the  previously 
derived  optimal  tax  rules  influenced  by  equity  considerations?  Second,  if  we 
choose  leisure  as  numeraire  and  admit  lump  sum  taxes  that  cannot  vary  across 
individuals,  when  will  uniform  commodity  taxes  be  optimal? 

We  begin  by  specifying  a social  welfare  function  of  the  form 


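\[
W = W(U^1, \ldots, U^H), \tag{6.15}
\]

which, maximized subject to the usual revenue constraint under the assumption of
zero profits in the private sector, yields the following $N$ first-order conditions for
optimal commodity taxes $t = (t_1, \ldots, t_N)$:

\[
-\sum_h W_h \lambda^h x_i^h + \mu\Big[\sum_h \sum_j t_j \frac{\partial x_j^h}{\partial p_i} + x_i\Big] = 0, \qquad i = 1, \ldots, N, \tag{6.16}
\]

where $W_h = \partial W/\partial U^h$, $\lambda^h = \partial U^h/\partial y^h$ and $x_i = \sum_h x_i^h$. Defining $\alpha^h$, as before, to
be the social marginal utility of individual $h$'s income,

\[
\alpha^h = W_h \lambda^h + \mu \frac{\partial R}{\partial y^h}, \tag{6.17}
\]

we may express the conditions (6.16) as

\[
\sum_j t_j S_{ij} = -\Big(\frac{\mu - \bar{\alpha}_i}{\mu}\Big) x_i, \qquad i = 1, \ldots, N, \tag{6.18}
\]

where $S_{ij} = \sum_h S_{ij}^h$ and

\[
\bar{\alpha}_i = \sum_h \frac{x_i^h}{x_i}\,\alpha^h \tag{6.19}
\]

is the average value of $\alpha$, weighted by individual consumption shares of good $i$.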


17Indeed,  even  it  all  individuals  are  identical,  the  optimal  tax  system  need  not  dictate  identical 
treatment.  This  is  discussed  in  Section  7. 
This neat formulation [due to Diamond (1975)] shows that the “equal proportional
reduction” rule is amended to call for a greater proportional reduction in
the purchase of commodities for which $\bar{\alpha}_i$ is small. The implication of this result is
more clearly seen if we note [following Feldstein (1972)] that


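\[
\bar{\alpha}_i = H \operatorname{cov}\Big(\frac{x_i^h}{x_i},\, \alpha^h\Big) + \frac{1}{H}\sum_h \alpha^h, \tag{6.20}
\]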


so that $\bar{\alpha}_i$ exceeds the unweighted mean of $\alpha^h$ if and only if purchases of
commodity $i$ are positively correlated with $\alpha$ over individuals. Normally, this
would define a necessary good, whose budget shares fall with income and hence
rise with $\alpha$. Note, however, that (6.18) applies to proportional reductions in
purchases of different commodities, and does not offer an explicit solution for
individual tax rates, unless we assume aggregate commodity demands to be
independent ($S_{ij} = 0$ for $i \neq j$). This yields


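\[
\frac{\theta_i}{\theta_j} = \frac{\varepsilon_{jj}}{\varepsilon_{ii}} \cdot \frac{\mu - \bar{\alpha}_i}{\mu - \bar{\alpha}_j}, \tag{6.21}
\]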


which says that the normal inverse elasticity rule is changed by the addition of a
second term expressing distributional concerns. Note that as marginal excess
burden, and hence the size of $\mu$ relative to $\alpha$, increases, efficiency considerations
come to dominate these optimal tax rules [Feldstein (1972)].
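A short numerical sketch may help fix ideas about (6.19) and (6.21). It is an illustration under stated assumptions, not a calculation from the literature: the three-person economy, the values of `alpha` and `mu`, and the two goods are all hypothetical, and cross-price effects are assumed to be zero as in (6.21).

```python
def weighted_alpha(consumption, alpha):
    """(6.19): average of alpha^h weighted by each person's share of the good."""
    total = sum(consumption)
    return sum(x / total * a for x, a in zip(consumption, alpha))


def tax_ratio(mu, alpha_i, alpha_j, eps_ii, eps_jj):
    """(6.21): theta_i / theta_j when aggregate demands are independent."""
    return (eps_jj / eps_ii) * (mu - alpha_i) / (mu - alpha_j)


# Hypothetical social marginal utilities of income, poorest person first.
alpha = [1.2, 0.8, 0.4]
mean_alpha = sum(alpha) / len(alpha)

bread = [6.0, 5.0, 4.0]    # consumed mostly by the poor
yachts = [0.0, 1.0, 9.0]   # consumed mostly by the rich

a_bread = weighted_alpha(bread, alpha)
a_yacht = weighted_alpha(yachts, alpha)

# Equal own-price elasticities isolate the distributional term.
ratio = tax_ratio(mu=1.5, alpha_i=a_bread, alpha_j=a_yacht, eps_ii=-0.5, eps_jj=-0.5)
```

Here the weighted average for bread lies above the unweighted mean of `alpha` and that for yachts lies below it, so the ratio of the bread tax to the yacht tax is below one even though the two own-price elasticities are identical. Raising `mu` relative to the `alpha` values pushes the ratio toward the pure inverse elasticity value.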

The  addition  of  the  possibility  of  lump  sum  taxation  increases  the  generality  of 
the  problem  without  much  additional  complexity.  If  individuals  have  one  source 
of  income,  then  the  combination  of  N commodity  taxes  and  a lump  sum  tax  may 
be  thought  of  as  a linear  income  tax  plus  N — 1 additional  commodity  taxes.  The 
ability  to  use  lump  sum  taxation  simply  adds  a constant  tax  term  T to  each 
consumer’s  indirect  utility  function  and  a term  HT  to  the  revenue  constraint. 
Differentiating the expanded Lagrangian with respect to $T$, we obtain the
additional first-order condition

\[
-\sum_h W_h \lambda^h + \mu\Big[H - \sum_h \sum_j t_j \frac{\partial x_j^h}{\partial y^h}\Big] = 0, \tag{6.22}
\]


to  be  added  to  the  N conditions  in  (6.16).  This  new  condition  simplifies  to 


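\[
\frac{1}{H}\sum_h \alpha^h = \mu, \tag{6.23}
\]

which, substituted into (6.18), gives

\[
\sum_j t_j S_{ij} = \Big(\frac{\bar{\alpha}_i - \bar{\alpha}}{\bar{\alpha}}\Big) x_i, \qquad \bar{\alpha} = \frac{1}{H}\sum_h \alpha^h. \tag{6.24}
\]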


Now, there should be reductions in commodity purchases only to the extent that
the good in question is consumed relatively more by people with low values of $\alpha$;
purchases of some goods will increase. With equal distributional weights, $\alpha^h$, each
of these reductions would be zero, and hence pure lump sum taxation would be
optimal.

An interesting question to ask here is under what conditions proportional taxes
$\theta = (t_1/p_1, \ldots, t_N/p_N)$ will be equal. In other words, since such uniform taxes are
equivalent to a single, proportional tax on the numeraire, labor, when is a linear
income tax optimal? A sufficient condition [Deaton (1979b)] is that each individual
$h$ have a utility function weakly separable into goods and leisure, with the
subfunction in goods possessing linear Engel curves with common slopes across
individuals. The intuition behind this result is that the restriction on goods is that
preferences obey the Gorman polar form required for exact aggregation of
commodity demands. If we can perform such aggregation, then we cannot use
differential taxation to distinguish among individuals for purposes of redistribution:
a linear income tax exhausts our capacity in this regard.

Note  the  similarity  of  this  result  to  that  of  the  case  of  non-linear  income 
taxation  [Atkinson  and  Stiglitz  (1976)],  where  weak  separability  alone  is  sufficient 
for  the  optimality  of  income  taxation.  There  is  a clear  relationship  here  between 
the  relaxation  of  the  restriction  on  the  linearity  of  taxes,  on  the  one  hand,  and 
that  of  the  linearity  of  preferences,  on  the  other. 

Empirical  studies  of  optimal  taxation  are  not  very  common,  perhaps  because 
the  information  needed  concerning  various  cross-substitution  terms  is  difficult  to 
obtain  without  a restriction  on  preferences  that  prejudges  the  result.  Two  studies, 
by  Atkinson  and  Stiglitz  (1972)  and  Deaton  (1977),  utilize  the  linear  expenditure 
system,  which  calls  for  higher  taxes  on  necessities  in  the  single-consumer  case  (as 
discussed  above)  and,  in  the  multi-consumer  case  with  lump  sum  taxes  available, 
calls  for  no  differential  commodity  taxes  at  all,  since  the  Gorman  conditions  are 
satisfied. Nevertheless, these calculations are still instructive. Deaton, for example,
calculates the optimal taxes on commodities under the assumption that labor
is  fixed  and  there  are  no  lump  sum  taxes.  Obviously,  with  fixed  labor  supply, 
uniform  taxes  on  commodities  are  non-distortionary,  but  may  have  undesirable 
distributional  effects.  For  a demand  system  estimated  for  the  U.K.,  he  calculated 
optimal  tax  rates  for  eight  groups  of  commodities  under  various  assumptions 
about  the  degree  of  inequality  in  the  social  welfare  function.  Perhaps  the  most 
interesting result obtained was that optimal tax rates do not behave monotonically
with respect to the degree of inequality aversion implicit in the social welfare
function. 

A recent  application  of  the  optimal  tax  results  in  the  context  of  developing 
countries (India) may be found in Heady and Mitra (1982). Still another approach
has been to infer from an existing indirect tax structure what the
government’s  preferences  would  have  to  be  for  the  structure  to  be  optimal 
[Christiansen  and  Jansen  (1978)  for  Norway,  Ahmad  and  Stern  (1981)  for  India]. 


7.  Further  topics  in  optimal  taxation 

There  are  a number  of  particular  problems  involving  taxation  generally  to  which 
optimal  tax  theory  has  been  applied.  This  section  presents  some  of  these. 


7.1.  Public  goods  provision 

The  classic  conditions  for  efficiency  in  the  provision  of  public  goods  were  derived 
by  Samuelson  (1954).  Aside  from  the  standard  requirement  that,  for  private 
(rival)  goods,  each  consumer’s  marginal  rate  of  substitution  between  two  goods 
should  equal  the  social  marginal  rate  of  transformation,  there  was  the  new 
condition  that,  between  a private  and  a public  good,  the  marginal  rate  of 
transformation  should  equal  the  sum  of  individual  marginal  rates  of  substitution. 
This  is  because  every  consumer  partakes  of  each  additional  unit  of  the  public 
good. 

Pigou  (1947)  argued  that  in  considering  the  benefits  of  a new  public  project,  the 
government  should  recognize  that  its  undertaking  may  require  the  introduction  of 
additional  deadweight  loss  through  the  tax  system.  The  implication  that  this 
increases  the  social  cost  of  public  goods  has  been  addressed  by  a number  of 
authors,  including  Diamond  and  Mirrlees  (1971),  Stiglitz  and  Dasgupta  (1971) 
and  Atkinson  and  Stern  (1974). 

Even  to  examine  the  question  of  public  goods,  we  must  allow  for  the  presence 
of  several  individuals.  Since  we  are  not  directly  interested  in  distributional  issues 
here,  we  assume  all  H individuals  to  be  identical  in  all  respects.  If  we  let  G be  a 
public  good  on  which  all  government  revenue  is  spent  and  which  all  consume, 
then  each  individual’s  indirect  utility  function  becomes 

\[
V(p; G) = \max_x\; U(x; G) \quad \text{subject to } p \cdot x = 0, \tag{7.1}
\]

with $\partial V/\partial G = \partial U/\partial G\,|_{x = x(p;\,G)}$. The production function is $f(x; G) = 0$. The
government maximizes the welfare of the representative individual by maximizing
the sum of individual utilities, since all individuals are the same. This gives rise to
the Lagrangian

\[
L = H V(p; G) - \mu f(x; G), \tag{7.2}
\]

with  first-order  conditions  with  respect  to  each  price  (except  that  of  the  untaxed 
numeraire) 


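\[
-H \lambda x_i^h - \mu \sum_j f_j \frac{\partial x_j}{\partial p_i} = 0, \tag{7.3}
\]

where $\lambda$ and $\mu$ are defined in the usual way. As in Section 5.3, we use the fact
that $p \cdot x^h = 0$ for each individual $h$ to obtain

\[
-\lambda x_i + \mu\Big[\sum_j t_j \frac{\partial x_j}{\partial p_i} + x_i\Big] = 0, \qquad i = 1, \ldots, N, \tag{7.4}
\]

where $x_i = \sum_h x_i^h = H x_i^h$.

As before, this may be rewritten as

\[
S t = -\Big(\frac{\mu - \alpha}{\mu}\Big) x, \tag{7.5}
\]

where $S$ is the aggregate Slutsky matrix and $\alpha$ is the social marginal utility of
each individual’s income.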

The  first-order  condition  with  respect  to  the  choice  of  public  good  G is 


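\[
H \frac{\partial V}{\partial G} - \mu\Big[\sum_j f_j \frac{\partial x_j}{\partial G} + f_G\Big] = 0, \tag{7.6}
\]

which yields (since $\lambda \equiv \partial U/\partial x_0$, $q_i \propto f_i$, $q_0 = 1$ and $p \cdot x^h = 0$)

\[
\sum_h \frac{\partial U^h/\partial G}{\partial U^h/\partial x_0} = \frac{\mu}{\lambda}\Big[\frac{f_G}{f_0} - \frac{dR}{dG}\Big], \tag{7.7}
\]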


where  R is  the  revenue  collected  (equal  to  the  public  goods  purchased,  in 
equilibrium). This result says that the appropriate social cost of the public good $G$
in terms of the numeraire, $x_0$, to which the sum of marginal rates of substitution
should be set equal, differs from the marginal rate of transformation $f_G/f_0$ for
two reasons. First, if public goods are complementary to taxed goods, increasing
$G$ may reduce excess burden by increasing consumption of taxed goods, making
$dR/dG > 0$ [Diamond and Mirrlees (1971)]. The other term, $\mu/\lambda$, equals the ratio
of the marginal disutility of raising a dollar of revenue to the marginal
utility of income, and exceeds one to the extent that an increase in revenue
increases excess burden. This corresponds to the point raised by Pigou. However,
it need not be the case that $\mu/\lambda$ exceeds one. Again, there is an income effect at
work.

This possibility is demonstrated (following Atkinson and Stern) by multiplying
both sides of (7.5) by the vector $t$ to obtain

\[
t' S t = -\Big(\frac{\mu - \alpha}{\mu}\Big) R, \tag{7.8}
\]

which, by the negative semi-definiteness of $S$, implies that $\mu > \alpha$ for positive
revenue. But $\alpha > \lambda$ [see equation (5.5)] only if $dR/dy$ is positive. If taxed goods
are, on average (weighted by tax rates), inferior, $dR/dy < 0$ and $\lambda > \alpha$. Hence, $\lambda$
may actually exceed $\mu$, meaning that raising an additional dollar to pay for public
goods may actually lessen excess burden by causing a shift toward the consumption
of taxed goods.


7.2.  Externalities 

Referring again to Pigou, we know that the appropriate response by the government
(under conditions of perfect information) to an externality is the imposition
of a tax that causes producers of the externality to internalize the additional social
cost (or benefit) of their action. Suppose, however, that all commodities, including
the one possessing the externality, are subject to distortionary taxation. How is
the Pigouvian prescription affected? Following Sandmo (1975), we assume identical
individuals, fixed producer prices and let the externality be a symmetric
consumption externality related to total consumption of good $N$. Thus, individual
utility for the representative individual $h$ is $U(x^h; x_N)$, where $x_N = H x_N^h$. The
partial derivative of $U$ with respect to $x_N$ may be positive or negative. Assuming
for convenience that each individual takes $x_N$ as given (as will be approximately
true for $H$ large), we may express the corresponding indirect utility function as
$V(p; x_N)$, parallel to the public good example, with $\partial V/\partial x_N = \partial U/\partial x_N\,|_{x = x(p;\,x_N)}$.
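Maximizing the sum of utilities with respect to $p$ subject to the need to raise
revenue $R$ through distortionary taxes yields the $N$ first-order conditions

\[
-\lambda x_i + H \frac{\partial U}{\partial x_N}\frac{\partial x_N}{\partial p_i} + \mu\Big[\sum_j t_j \frac{\partial x_j}{\partial p_i} + x_i\Big] = 0, \qquad i = 1, \ldots, N, \tag{7.9}
\]

or

\[
-\lambda x_i + \mu\Big[\sum_j t_j^* \frac{\partial x_j}{\partial p_i} + x_i\Big] = 0, \qquad i = 1, \ldots, N, \tag{7.10}
\]

where

\[
t_i = t_i^*, \quad i = 1, \ldots, N - 1, \qquad t_N = t_N^* - \frac{H}{\mu}\frac{\partial U}{\partial x_N}.
\]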

Equation  (7.10)  is  the  standard  optimal  tax  result,  but  it  applies  to  the  vector  t* 
rather  than  t.  The  difference  between  them  implies  that  the  optimal  tax  on  good 
N equals  that  dictated  by  the  standard  formula  plus  the  externality  imposed  by 
additional  consumption  of  the  good:  the  Pigouvian  tax.  Thus,  the  optimal  tax  and 
Pigouvian  taxes  are  separable,  in  a sense;  we  may  imagine  choosing  the  two 
independently.  However,  this  independence  is  only  present  analytically,  since  the 
actual  level  of  the  externality,  and  hence  the  Pigouvian  tax,  depends  on  the  actual 
equilibrium  and  hence  the  optimal  tax  rates;  the  same  is  true  in  the  other 
direction. 


7.3.  Pre-existing  distortions 

If  the  government  faces  pre-existing  distortions  (of  which  the  preceding  example 
of  externalities  is  a specific  kind),  it  may  wish  to  alter  its  choice  of  optimal  taxes. 
Following  Green  (1961),  let  us  assume  that  lump  sum  taxes  are  available,  but 
certain  prices  are  distorted  and  cannot  be  influenced  directly.  This  could  be  the 
result  of  non-competitive  behavior,  but  we  shall  assume  it  to  be  due  to  some  tax 
that  must  be  maintained,  perhaps  for  political  purposes.  Assuming  that  the 
representative  individual’s  only  lump  sum  income  is  from  the  government,  we 
have  the  problem 

\[
\max_{p^*,\,T}\; V(p, -T) \quad \text{subject to } (p - q) \cdot x + T = R, \tag{7.11}
\]
where $p^*$ is the subset of $p$ that may be adjusted. Note that unless at least two
prices are fixed, equiproportional, non-distortionary taxation is possible.

Differentiating the Lagrangian corresponding to (7.11) with respect to $p_i$ and $T$
yields

\[
-\lambda x_i + \mu\Big[\sum_j t_j \frac{\partial x_j}{\partial p_i} + x_i\Big] = 0 \qquad \forall\, p_i \in p^*, \tag{7.12a}
\]

\[
-\lambda + \mu\Big[1 - \sum_j t_j \frac{\partial x_j}{\partial y}\Big] = 0, \tag{7.12b}
\]

which may be written as

\[
\sum_j t_j S_{ij} = -\Big(\frac{\mu - \alpha}{\mu}\Big) x_i \qquad \forall\, p_i \in p^*, \tag{7.13a}
\]

\[
\mu = \alpha, \tag{7.13b}
\]

for $\alpha$ defined as above. These conditions are quite familiar, and yield the
requirement that

\[
\sum_j t_j S_{ij} = 0 \qquad \forall\, p_i \in p^*. \tag{7.14}
\]

j 


This does not result in uniform taxes unless at most one tax is fixed (in which case
the zero degree homogeneity of $S$ allows us to choose any level of proportional
taxes). In particular, suppose all taxes but $t_1$ are fixed, and $t_0 = t_3 = \cdots = t_N = 0$.
Then there is one condition, corresponding to the choice of $t_1$. Using compensated
elasticities $\varepsilon_{ij} = S_{ij}(p_j/x_i)$, we may express this as

\[
\theta_1 = -\theta_2\,\varepsilon_{12}/\varepsilon_{11}, \tag{7.15}
\]

where $\theta_i = t_i/p_i$ is the proportional tax on good $i$. Since $\varepsilon_{11} < 0$, this calls for a tax
on good 1 (assuming $\theta_2 > 0$) if $\varepsilon_{12} > 0$, and a subsidy if $\varepsilon_{12} < 0$. If the distorted
good is a substitute for good 1, a tax on good 1 will shift consumption into good 2,
lessening the original distortion. Taxing a complement, however, would worsen
the distortion. (Compare butter and margarine vs. left shoes and right shoes.)
In the wider case in which there are several pre-existing distortions and a single
free instrument, $t_1$, the condition is

\[
\theta_1 = -\frac{\sum_{j \neq 1} \theta_j\,\varepsilon_{1j}}{\varepsilon_{11}}, \tag{7.16}
\]

so that the complement-substitute rule now applies to the tax-weighted commodity
average. More generally, when several instruments can be set, the results are
more complicated.
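The complement-substitute rule in (7.15) and (7.16) is easy to check numerically. The sketch below is illustrative only: the function name and the elasticity and tax values are hypothetical, and it assumes a single free instrument on good 1 with all other proportional taxes held fixed.

```python
def second_best_tax(eps_own, eps_cross, theta_fixed):
    """(7.16): theta_1 = -sum_j theta_j * eps_1j / eps_11, summing over the fixed taxes.

    eps_own     -- compensated own-price elasticity of good 1 (negative)
    eps_cross   -- compensated cross elasticities eps_1j for the distorted goods
    theta_fixed -- the pre-existing proportional taxes theta_j on those goods
    """
    if eps_own >= 0:
        raise ValueError("compensated own-price elasticity must be negative")
    weighted = sum(t * e for t, e in zip(theta_fixed, eps_cross))
    return -weighted / eps_own


# One distorted good taxed at 25 percent, as in (7.15).
substitute = second_best_tax(-0.8, [0.4], [0.25])    # positive: tax good 1
complement = second_best_tax(-0.8, [-0.4], [0.25])   # negative: subsidize good 1

# Two distorted goods: the sign depends on the tax-weighted cross elasticities.
mixed = second_best_tax(-0.8, [0.4, -0.2], [0.25, 0.10])
```

With one distorted good the sign of the corrective tax follows the sign of the cross elasticity. With two, the substitute carries the larger tax weight in this example, so good 1 is still taxed, though at a lower rate.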

Several  other  authors  have  considered  particular  restrictions  on  commodity 
taxation  and  profits  taxation  [for  example,  Dasgupta  and  Stiglitz  (1972)  and 
Mirrlees (1972)] and the effect of such restrictions on the desirability of production
efficiency. Auerbach (1979b) considers the particular production distortion of
differential capital income taxation, obtaining a uniform taxation result about
separability of factors in production that closely parallels those on the consumption
side already discussed in Section 6.


7.4.  Taxation  and  risk 

There  are  many  interesting  questions  that  concern  the  interaction  between  taxes 
and  risk-bearing.  A particular  one  that  fits  into  the  current  discussion  is  the 
optimal  taxation  of  risky  assets.  This  problem  was  first  examined  by  Stiglitz 
(1972)  and  extended  by  Auerbach  (1981).  The  basic  insight  is  that  the  optimal  tax 
results  already  derived  can  be  applied  directly  to  the  case  of  risky  assets  by 
imagining  the  commodities  being  taxed  to  be  Arrow-Debreu  state-contingent 
ones.  The  differences  that  arise  come  from  the  fact  that  we  normally  make 
different assumptions about the structure of utility functions and the completeness
of markets when we deal with risk.

The  basic  model  we  consider,  following  Stiglitz  (1972),  is  a two-period  model  in 
which  the  representative  individual  may  consume  a certain  good  (leisure)  out  of 
some  endowment,  and  may  purchase  one  of  two  linearly  independent  assets 
yielding  returns  in  two  states  at  date  1.  Because  the  two  assets  span  the  states  of 
nature, the consumer may purchase any combination of state-contingent commodities
at date 1, and there is a well-defined implicit price for each. A corollary
of  this  is  that  there  is  a unique  pair  of  tax  rates  on  commodities  in  the  two  states 
corresponding  to  each  tax  regime  that  applies  to  the  assets  themselves.  This  is 
helpful,  because  though  our  optimal  tax  results  apply  to  the  former,  actual  tax 
rules  normally  apply  to  the  latter.  In  the  more  general  case  without  asset 
spanning,  the  optimal  tax  problem  becomes  more  complicated,  just  as  it  would  if 
individual  commodities  in  a riskless  world  could  not  be  purchased  independently. 
Stiglitz (1972) obtained his main result concerning the relative taxation of a risky
and a riskless asset from a direct consideration of the effects of taxation on asset
demands. It is, perhaps, easier to see the connection with previous results, and the
effects of particular assumptions, if we begin with the state-contingent commodities
themselves [following Auerbach (1981)].

Letting the good consumed in period 0 be good zero, and the other two
commodities be labelled 1 and 2, and taking good 0 to be numeraire, we have the
basic optimal tax rule (5.21), which we write here for convenience:

\[
\frac{\theta_1}{\theta_2} = \frac{\varepsilon_{12} + \varepsilon_{21} + \varepsilon_{20}}{\varepsilon_{12} + \varepsilon_{21} + \varepsilon_{10}}. \tag{7.17}
\]

This result can be simplified if we adopt the axioms necessary for the consumer to
engage in expected utility maximization. In this case, the consumer’s objective
function becomes

\[
U(x_0, x_1, x_2) = \pi_1 U^1(x_0, x_1) + \pi_2 U^2(x_0, x_2), \tag{7.18}
\]

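where $U^1(\cdot) = U^2(\cdot)$, $\pi_i$ is the probability of state $i$ occurring, and $\varepsilon_{10}$ and $\varepsilon_{20}$
may be expressed as

\[
\varepsilon_{i0} = M\Big[-\frac{U_{jj}\,x_j}{U_j} + p_j x_j \frac{\partial \ln (U_j/U_i)}{\partial x_0}\Big], \qquad i = 1, 2, \quad j = 2, 1, \tag{7.19}
\]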


where $M$ is a positive constant and $U_i$ and $U_{ij}$ are first and second derivatives of
utility. The second term in brackets in (7.19) is familiar from Section 6, and
equals zero if preferences are weakly separable between periods. If this is so (in
which case, utility is also strongly separable, since it is already assumed separable
between states), then the tax on good 1 should be higher than that on good 2 if
and only if $-(U_{11}x_1/U_1) > -(U_{22}x_2/U_2)$, but these are just the Arrow (1965)-Pratt
(1964) measures of relative risk-aversion in the two states. Intuitively, as an
individual becomes more risk-averse, his behavior becomes less responsive to
differences in rates of return. Thus, a tax is less distortionary.
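The comparison of relative risk-aversion across states can be made concrete with a small sketch. It is an illustration only, assuming preferences that are weakly separable between periods and a common within-period utility function in both states; the function names, the two utility specifications and the consumption levels are hypothetical.

```python
import math


def relative_risk_aversion(u_prime, u_double_prime, x):
    """Arrow-Pratt relative risk-aversion: -x * U''(x) / U'(x)."""
    return -x * u_double_prime(x) / u_prime(x)


def more_heavily_taxed_state(u_prime, u_double_prime, x_state1, x_state2, tol=1e-9):
    """Return which state-contingent good bears the higher tax under the rule above."""
    r1 = relative_risk_aversion(u_prime, u_double_prime, x_state1)
    r2 = relative_risk_aversion(u_prime, u_double_prime, x_state2)
    if abs(r1 - r2) <= tol:
        return "equal"
    return "state 1" if r1 > r2 else "state 2"


# Constant relative risk-aversion: U(x) = x**(1 - g) / (1 - g) with g = 2.
def crra_prime(x):
    return x ** -2.0


def crra_double_prime(x):
    return -2.0 * x ** -3.0


# Constant absolute risk-aversion: U(x) = -exp(-a * x) with a = 0.5,
# so relative risk-aversion a * x rises with consumption.
def cara_prime(x):
    return 0.5 * math.exp(-0.5 * x)


def cara_double_prime(x):
    return -0.25 * math.exp(-0.5 * x)


crra_result = more_heavily_taxed_state(crra_prime, crra_double_prime, 1.0, 3.0)
cara_result = more_heavily_taxed_state(cara_prime, cara_double_prime, 1.0, 3.0)
```

With constant relative risk-aversion the two measures coincide at any pair of consumption levels, so the taxes are equal. With increasing relative risk-aversion the state with higher consumption is the one taxed more heavily.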

That taxes should be equal when relative risk aversion is constant is not
surprising, even without knowledge of the basic optimal tax results. It is for this
class  of  preferences  that  the  basic  results  of  Samuelson  (1969)  and  Merton  (1969) 
concerning  the  separation  of  portfolio  and  savings  decision  apply.  If  we  cannot 
influence  the  amount  of  savings,  and  hence  leisure  consumed,  by  inducing 
portfolio  shifts,  then  such  a relative  distortion  has  no  benefit. 

To  convert  these  results  to  the  taxes  on  the  two  assets  themselves,  which  we 
label  A and  B , we  use  the  fact  [see  Auerbach  (1981)]  that 


\[
\operatorname{sgn}(\theta_A - \theta_B) = \operatorname{sgn}(r_1^A r_2^B - r_2^A r_1^B)\,\operatorname{sgn}(\theta_1 - \theta_2), \tag{7.20}
\]

where $r_i^j$ is the return in state $i$ of asset $j$. Assuming one asset, which we take to
be asset $A$ without loss of generality, is risk-free, then the tax should be greater
(smaller) on the risky asset $B$ if relative risk-aversion is higher (lower) in the state
with the higher (lower) return. In other words, the risky asset should face a higher
or lower tax than the safe asset according to whether relative risk-aversion is
increasing or decreasing [Stiglitz (1972)]. More generally, if both assets are risky,
then one can apply any standard notion of increasing risk [Rothschild and Stiglitz
(1970)] to argue that if asset $B$ is riskier than asset $A$, its return will be more
dispersed and hence $(r_1^A r_2^B - r_2^A r_1^B)$ will be positive. This will yield a similar result
for taxation of the riskier asset.

It  is  important  to  recognize  that  these  results  assume  complete,  competitive 
markets.  While  a common  assumption  without  risk,  it  is  less  acceptable  when  the 
commodities  concerned  are  state-contingent.  (The  same  critique  also  applies  to 
intertemporal  problems  with  date-indexed  goods.)  In  particular,  we  are  implicitly 
assuming  that  the  government  cannot  increase  the  diversification  of  risk  by 
collecting  risky  taxes  and  pooling  them.  In  a real  world  context  where  many 
assets  are  not  traded,  this  may  be  a highly  questionable  restriction  to  impose. 

A second  issue  of  taxation  and  risk  concerns  the  question  of  whether  the 
government  can  increase  the  welfare  of  the  representative  individual  by  inducing 
risk  through  the  tax  system.  Normally,  risk  averse  individuals  are  made  worse  off 
by  being  forced  to  bear  risk.  However,  the  optimal  taxation  equilibrium  is  a 
distorted  one,  and  the  famous  dictum  of  Lipsey  and  Lancaster  (1956-57)  applies 
here:  once  one  condition  for  a Pareto  optimum  is  violated,  there  is  no  reason  to 
expect  that  the  violation  of  others  will  necessarily  worsen  matters. 

There  are  two  general  strands  in  the  literature  that  deal  with  the  use  of  induced 
risk  as  a policy  tool.  Weiss  (1976)  and  Stiglitz  (1982)  show  that  a random  tax 
system,  or  one  in  which  there  is  tax  evasion  with  a probability  of  detection,  may 
be  superior  to  a certain  tax  system  because,  under  specified  conditions  with 
respect  to  individual  preferences,  such  risk  may  lessen  the  labor  supply  distortion 
of  the  income  tax.  [Also  see  Sandmo  (1981)  on  the  subject  of  tax  evasion.] 

A second  issue  relates  to  the  case  of  several  individuals,  and  arises  from  the 
possibility  that  in  the  presence  of  indirect  taxation,  the  utility  possibility  frontier 
may  be  non-convex.  Even  with  identical  individuals,  then,  we  might  wish  to  tax 
the  consumption  of  the  same  good  by  different  individuals  at  different  rates 
[Atkinson  and  Stiglitz  (1976,1980),  Stiglitz  (1982),  Balcer  and  Sadka  (1982)].  This 
is  depicted  in  Figure  7.1.  Suppose  two  individuals,  1 and  2,  have  identical 
preferences and consume goods and leisure. If we seek to maximize $(U^1 + U^2)$ by
choosing individual-specific excise taxes on consumption, the first-order condition
will be satisfied with equal taxes at $U^1 = U^2 = U^E$, by the symmetry of the problem.
But  this  may  represent  a local  minimum,  as  shown.  Social  welfare  may  be 
improved  by  choosing  either  point  A or  point  B.  This  represents  an  unequal 
treatment  of  equal  individuals  and  may  violate  proscriptions  of  such  horizontal 
inequity. However, suppose the tax system were randomized so that point $A$ were
chosen half the time, and point $B$ the other half. This would give the same
expected utility to each individual. Moreover, it would yield the same value of the
social welfare function, defined on individual expected utilities, as before at either
$A$ or $B$,

\[
EU^1 + EU^2 = \tfrac{1}{2}[U^L + U^H] + \tfrac{1}{2}[U^H + U^L] = U^H + U^L. \tag{7.21}
\]

Figure 7.1. Optimal taxation and non-convexities.


Thus,  randomization  may  be  desirable. 
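The arithmetic behind (7.21) can be sketched in a few lines. The utility levels below are hypothetical and chosen only so that the symmetric point $E$ is a local minimum of the utilitarian sum, as in Figure 7.1; the function names are likewise illustrative.

```python
def social_welfare(u_first, u_second):
    """Utilitarian welfare defined on the two individuals' (expected) utilities."""
    return u_first + u_second


def expected_utilities(u_high, u_low, prob_a=0.5):
    """Point A gives (u_high, u_low); point B gives (u_low, u_high)."""
    eu_first = prob_a * u_high + (1 - prob_a) * u_low
    eu_second = prob_a * u_low + (1 - prob_a) * u_high
    return eu_first, eu_second


# Hypothetical utility levels: u_equal is the equal-treatment outcome at point E.
u_high, u_low, u_equal = 10.0, 6.0, 7.5

eu_first, eu_second = expected_utilities(u_high, u_low)
randomized = social_welfare(eu_first, eu_second)
equal_treatment = social_welfare(u_equal, u_equal)
```

A fair coin between the two asymmetric points gives both individuals the same expected utility, so horizontal equity holds ex ante, while the welfare sum matches that at either asymmetric point and exceeds the equal-treatment level.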


8.  Tax  reform 

All  of  the  optimal  tax  problems  analyzed  thus  far  share  in  common  the  fact  that 
global  optima  are  sought.  There  are  a number  of  new  issues  arising  from  a 
consideration  of  tax  reform,  rather  than  tax  design. 

One  problem  of  tax  reform  derives  from  the  existence  of  an  initial  allocation. 
Though  a new  tax  system  may  be  more  efficient  and  more  equitable  than  the 
existing  one,  the  transition  from  old  to  new  may  cause  a redistribution  of 
resources to occur that in itself is undesirable. For example, it has often been
suggested  in  the  U.S.  that  the  tax  subsidy  for  state  and  municipal  bonds  be 
removed.  If  this  were  done  unexpectedly,  it  would  cause  a capital  loss  for  the 
holders  of  such  bonds,  but  not  for  other,  otherwise  identical  individuals.  Such 
treatment  may  be  thought  of  as  a violation  of  horizontal  equity  [Feldstein  (1976)] 
which  may  be  explicitly  accounted  for  in  an  expanded  social  welfare  function 
[King  (1983a)].  This  problem  undoubtedly  is  one  of  the  reasons  why  tax  reform  is 
so  difficult  to  achieve. 

A second  general  problem  of  tax  reform,  which  shall  be  the  main  focus  of  this 
section,  is  that  the  direction  in  which  to  move  from  the  current  system  is  not 
always  evident.  Even  if  all  distortions  can  be  reduced  somewhat,  this  may  not 
increase  economic  efficiency.  The  basic  difficulty  is  that  we  can  only  be  sure  that 
movement  in  the  direction  of  a global  optimum  will  improve  matters  if  we  are 
sufficiently  close  to  that  optimum  initially.  A related  problem  is  whether  one  can 
increase  economic  efficiency  in  a piecemeal  fashion,  by  removing  distortions  one 
at  a time.  In  general,  such  a scheme  for  tax  reform  may  decrease  welfare  along  the 
transition  path  to  a global  optimum.  Restrictions  on  preferences  and  production 
sufficient to prevent this are extremely restrictive [Boadway and Harris (1977)].


8.1.  Moving  to  lump  sum  taxation 


Lump  sum  taxes  are  non-distortionary,  but  it  need  not  follow  that  partially 
reducing  distortions  and  replacing  them  with  lump  sum  taxes  will  improve 
efficiency.  One  case  in  which  it  will  is  when  the  distortionary  tax  rates  are  set  at 
each  point  of  the  transition  at  the  optimal  tax  rates  for  the  revenue  being 
collected  by  non-lump  sum  taxes.  That  is,  if  a certain  amount  of  revenue,  R,  is 
collected  initially  by  the  distortionary  taxes,  and  a lump  sum  tax  T is  introduced, 
the  new  taxes  should  be  those  optimal  for  collecting  R — T.  As  T increases,  this 
sequence of optimal tax rates ensures a monotonic increase in utility. This result is
due  to  Atkinson  and  Stern  (1974),  and  demonstrated  as  follows.  Consider  the 
optimal  tax  problem 

$$\max_{T,\,p}\; V(p, -T) \quad \text{subject to} \quad (p - q)\cdot x + T \ge R, \qquad (8.1)$$

where  T is  the  lump  sum  tax  faced  by  the  individual.  Differentiating  the 
corresponding  Lagrangian  with  respect  to  T yields  the  effect  of  an  increase  in  T 
on  utility,  given  that  p is  chosen  optimally, 


$$\frac{dV}{dT} = -\lambda + \mu\left[-t\cdot\frac{\partial x}{\partial y} + 1\right], \qquad (8.2)$$
where $\lambda$, $\alpha$ and $\mu$ are defined in the usual way to be the marginal utility of
income, the social marginal utility of income and the Lagrange multiplier on the
revenue constraint. Since $\alpha = \lambda + \mu\, t\cdot\partial x/\partial y$, the right-hand side of (8.2)
equals $\mu - \alpha$. We know from expression (7.8) that $\mu > \alpha$, so utility
must increase as T does: when the tax vector t is chosen optimally, there is always
a positive marginal excess burden to revenue collection.

Unfortunately,  this  is  not  a very  realistic  assumption  to  make  in  the  current 
context. The taxes we may wish to reform may cause unnecessarily large
distortions, and we may be restricted to a proportional reduction formula, or some
other  constraint  on  how  they  are  to  be  lowered. 

Consider  the  case  of  an  arbitrary  change  in  the  levels  of  excise  taxes  t and  lump 
sum  taxes  T for  the  case  of  a single  individual  and  fixed  producer  prices.  [This 
latter  assumption  can  be  relaxed;  see  Dixit  (1975).]  We  have  [following  Atkinson 
and  Stiglitz  (1980)] 

$$dU = \frac{\partial V}{\partial p}\cdot dt - \frac{\partial V}{\partial y}\,dT = -\lambda\,(x\cdot dt + dT), \qquad (8.3a)$$

and 

$$dR = d(t\cdot x + T) = x\cdot dt + t\cdot dx + dT = 0, \qquad (8.3b)$$

which  yields 

$$dU = \lambda\, t\cdot dx. \qquad (8.4)$$

Utility  is  increased  by  the  tax  change  if  consumption  changes  to  increase  revenue 
from  the  existing  taxes,  thereby  reducing  the  associated  excess  burden. 

From  the  Slutsky  equation,  we  have 


$$dx = \frac{\partial x}{\partial p}\,dt - \frac{\partial x}{\partial y}\,dT = S\,dt - \frac{\partial x}{\partial y}\,(x'\,dt + dT), \qquad (8.5)$$


which,  combined  with  (8.3b)  and  (8.4),  yields 


$$dU = \frac{\lambda}{1 - t\cdot\partial x/\partial y}\; t'S\,dt. \qquad (8.6)$$


This holds for any change in t and T, and can be useful in analyzing particular
kinds of tax reforms. For example, suppose all distortions are reduced
proportionally, i.e., $dt = -bt$. Then because S is negative semi-definite, $dU > 0$ if and
only if $(1 - t\cdot\partial x/\partial y) > 0$ [Dixit (1975)]. This condition says that a dollar increase
in income causes the consumer to pay less than a dollar in additional excise taxes.
Since $p = q + t$ and the budget constraint implies $p\cdot\partial x/\partial y = 1$, it is equivalent to
the requirement that $q\cdot x$ increase with y: as
the  consumer  spends  more,  the  social  cost  of  the  goods  purchased  also  increases. 
If  this  condition  is  violated,  then  it  is  possible  that  multiple  equilibria  exist,  and 
the  tax  reduction  may  move  the  economy  away  from  the  undistorted  optimum 
[Foster  and  Sonnenschein  (1970)]. 
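As a numerical illustration of (8.6), the following Python sketch compares the formula with an exactly computed utility change for a revenue-neutral proportional tax cut. Everything specific here is an assumption made for illustration and is not taken from the text: the Cobb-Douglas consumer with budget shares a and exogenous income m, the function name check_formula_8_6, and the parameter values. With these preferences V = ln y - sum_i a_i ln p_i (up to a constant), so the marginal utility of income is 1/y, and the lump sum tax that holds revenue constant can be solved in closed form.

```python
import math


def check_formula_8_6(a, q, t, m, T, dt):
    """Return (first-order dU from (8.6), exact change in utility).

    Illustrative assumption: a Cobb-Douglas consumer with budget shares `a`
    (summing to one), producer prices `q`, excise taxes `t`, exogenous income
    `m` and lump sum tax `T`. `dt` is the change in excise taxes; the lump
    sum tax is adjusted so that total revenue t.x + T is unchanged.
    """
    n = len(a)
    p = [q[i] + t[i] for i in range(n)]            # consumer prices
    y = m - T                                      # income net of lump sum tax
    x = [a[i] * y / p[i] for i in range(n)]        # Cobb-Douglas demands
    dxdy = [a[i] / p[i] for i in range(n)]         # income effects dx/dy
    lam = 1.0 / y                                  # marginal utility of income

    # Slutsky matrix: S_ij = dx_i/dp_j + x_j * dx_i/dy
    S = [[(-a[i] * y / p[i] ** 2 if i == j else 0.0) + x[j] * dxdy[i]
          for j in range(n)] for i in range(n)]
    t_S_dt = sum(t[i] * S[i][j] * dt[j] for i in range(n) for j in range(n))
    delta = 1.0 - sum(t[i] * dxdy[i] for i in range(n))   # 1 - t.dx/dy
    du_formula = lam * t_S_dt / delta

    # Exact change: apply dt, then solve k*(m - T) + T = R for the new T,
    # where k = sum_i t_i a_i / p_i is excise revenue per unit of net income.
    revenue = sum(t[i] * x[i] for i in range(n)) + T
    t_new = [t[i] + dt[i] for i in range(n)]
    p_new = [q[i] + t_new[i] for i in range(n)]
    k_new = sum(t_new[i] * a[i] / p_new[i] for i in range(n))
    T_new = (revenue - k_new * m) / (1.0 - k_new)

    def utility(income, prices):
        # Indirect utility up to an additive constant.
        return math.log(income) - sum(a[i] * math.log(prices[i]) for i in range(n))

    return du_formula, utility(m - T_new, p_new) - utility(y, p)


# Proportional reduction dt = -b t with a small b (hypothetical numbers).
a, q, t = [0.5, 0.3, 0.2], [1.0, 1.0, 1.0], [0.0, 0.2, 0.4]
b = 1e-4
approx, exact = check_formula_8_6(a, q, t, m=10.0, T=0.0, dt=[-b * ti for ti in t])
print(approx, exact)
```

With Cobb-Douglas preferences and positive producer prices, 1 - t.dx/dy is always positive, so the two printed numbers should both be positive and differ only by second-order terms. The perverse case discussed in the text cannot arise under this functional form; it requires stronger income effects.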

The possibility that reducing distortions lowers utility may be demonstrated
graphically [following Hatta (1977)] for the simple case in which there are only
two goods. Suppose that a certain revenue R (measured in units of commodity 1)
must be raised, and that the consumer has an endowment $\bar{x}_1$. The possible
equilibria lie along the social production constraint M in Figure 8.1.
Superimposed on this constraint are a series of indifference curves, the highest
feasible one passing through point A, the undistorted optimum. Normally, we
would expect that as we travel along M from point A toward either axis,
decreasing the feasible utility level, the marginal rate of substitution between
$x_1$ and $x_2$ changes monotonically. (This is true, of course, for movements
along an indifference curve and, hence, for local movements away from A along M,
where there is no first-order income effect.) If this is the case, then a
revenue-preserving reduction in the divergence between the relative price of $x_2$
and its social cost, in terms of $x_1$, must increase utility, for it will induce a
movement along M toward point A. However, there may be cases in which there
is no such monotonicity, and a given slope may occur at an odd number of
different points on M, not just one. In this case, reductions in the price distortion
may actually move the consumer away from point A.

Figure 8.1. Prices and utility.

That this possibility is equivalent to the condition derived from (8.6) is
demonstrated graphically in Figure 8.2, where an increase in lump sum income
above $\bar{x}_1$ causes the consumer to shift from point B to point C, inside the
production constraint M. Since the indifference curve slopes at B and C are
the same, the slope at D must be flatter than at B. Thus, a steepening of the
consumer's budget line resulting from a reduction in the price distortion will
cause a movement away from B, along M, toward the $x_1$ axis rather than toward
D and A, thereby lowering the consumer's utility.

Figure 8.2. Multiple equilibria with taxes.

A particular application of this result is that when equilibrium is unique, a
consumption tax is superior to a wage tax in the presence of pure rents, since the
former tax is equivalent to the latter in conjunction with a lump sum rent tax
[Helpman and Sadka (1982)].

Another  result  that  follows  from  (8.6)  is  for  the  case  where  the  tax  distortion  is 
zero  for  one  good  (arbitrarily,  good  zero)  and  equiproportional  for  other  goods. 
That is, in our previous notation, $t = \theta p$. Since $p'S = 0$, (8.6) may be rewritten as


$$dU = -\frac{\lambda\,\theta\,p_0}{1 - t\cdot\partial x/\partial y}\; S_0'\,dt, \qquad (8.7)$$


where $S_0 = (S_{01}, S_{02}, \ldots, S_{0N})$. A sufficient condition for this to be positive
(assuming $\partial R/\partial y < 1$) is that taxes be decreased on substitutes for good zero
($S_{0j} > 0$) and increased on complements [Dixit (1975)].


8.2.  Reform  without  lump  sum  taxation 

This  problem  is  harder,  because  there  is  no  obvious  “first-best”  looming  in  the 
distance  to  guide  our  movement.  General  characterization  of  the  direction  in 
which  taxes  should  be  changed  is  a difficult  problem,  and  while  progress  has  been 
made  [Guesnerie  (1977),  Diewert  (1978)],  there  is  little  we  can  say  of  a concrete 
nature  without  further  assumptions. 

One  approach  that  sidesteps  this  problem  is  to  characterize  observable  changes 
in  equilibrium  that  will  result  if  welfare  is  improved.  Following  Pazner  and 
Sadka  (1981),  we  can  use  revealed  preference  theory  to  evaluate  a balanced 
budget change in distortionary taxes. Let $t_0 = p_0 - q$ be the initial set of taxes
(with producer prices fixed) and $t_1 = p_1 - q$ be the prospective change. If
$p_1\cdot x_1 > p_1\cdot x_0$ (where $x_0$ and $x_1$ are the purchases in the two situations), then
$x_1$ is preferred by the consumer. Hence, utility has increased. Since the change
is revenue-neutral, $d(t\cdot x) = 0$ and hence $q\cdot x_1 = q\cdot x_0$, so this condition is
equivalent to $t_1\cdot x_1 > t_1\cdot x_0$, or $t_1\cdot\Delta x > 0$, where $\Delta x = x_1 - x_0$. [Note the
similarity of this discrete condition to (8.4).] Likewise, if $t_0\cdot\Delta x < 0$, the original situation is
preferred.  Unfortunately,  there  is  an  indeterminate  range  in  which  neither  of 
these  conditions  is  satisfied. 
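The revealed preference test above can be sketched as a small Python function. The function name, the return labels and the numbers in the example are hypothetical; the only substance taken from the text is the pair of sufficient conditions, $t_1\cdot\Delta x > 0$ for the new situation and $t_0\cdot\Delta x < 0$ for the original one, applied to a revenue-neutral change with fixed producer prices.

```python
def revealed_preference_test(t0, t1, x0, x1):
    """Classify a revenue-neutral tax change by the two sufficient conditions.

    t0, t1: tax vectors before and after the change.
    x0, x1: observed purchases before and after the change.
    Returns 'new preferred', 'old preferred' or 'indeterminate'.
    """
    dx = [after - before for before, after in zip(x0, x1)]   # x1 - x0
    t1_dx = sum(tax * d for tax, d in zip(t1, dx))
    t0_dx = sum(tax * d for tax, d in zip(t0, dx))
    if t1_dx > 0:
        # x0 was affordable at the new prices, yet x1 was chosen.
        return "new preferred"
    if t0_dx < 0:
        # x1 was affordable at the old prices, yet x0 was chosen.
        return "old preferred"
    return "indeterminate"


# Hypothetical data: producer prices (1, 1), income 16, revenue 2 in both cases.
print(revealed_preference_test(t0=[0.0, 0.5], t1=[0.0, 0.25],
                               x0=[10.0, 4.0], x1=[6.0, 8.0]))
```

The function checks the first condition before the second. For a consumer with consistent preferences the two conditions cannot hold together, so the order only matters for inconsistent data.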

If  we  assume  producer  prices  to  be  fixed  (here  this  restriction  is  necessary)  and 
that  all  goods  but  the  numeraire  are  taxed  uniformly,  then  we  can  characterize  a 
utility  increasing  tax  change.  The  three-good  case  was  analyzed  by  Corlett  and 
Hague  (1953-54),  with  a generalization  provided  by  Dixit  (1975),  whose  analysis 
we  follow.  Note  that  (8.4)  still  is  valid  in  determining  whether  a tax  change 
increases utility. However, since lump sum taxes are unavailable, $d(t\cdot x) = 0$. Using
(8.5), for $dT = 0$, we have

$$0 = d(t\cdot x) = x\cdot dt + t\cdot dx = x\cdot dt + \frac{t'S}{\Delta}\,dt = \left(x' + \frac{t'S}{\Delta}\right)dt, \qquad (8.8)$$
where $\Delta = 1 - t\cdot\partial x/\partial y$. For the case where $t_0 = 0$ and $t = \theta p$, we use the
homogeneity of S to rewrite this as

$$\left(x' - \frac{\theta\,p_0}{\Delta}\,S_0'\right)dt = 0, \qquad (8.9)$$

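which, using the definition of compensated elasticities $\varepsilon_{ij} = S_{ij}(p_j/x_i)$, may be
written

$$\sum_i x_i\left(1 - \frac{\theta}{\Delta}\,\varepsilon_{i0}\right)dt_i = 0.$$

From (8.4), we have (for $dT = 0$)

$$dU = -\lambda\, x\cdot dt. \qquad (8.10)$$

If we assume that

$$x_i\left(1 - \frac{\theta}{\Delta}\,\varepsilon_{i0}\right) = \frac{1}{\Delta}\,\frac{\partial R}{\partial t_i}$$

is positive, and make the related assumption that $\Delta$ is positive, then [comparing
(8.9) and (8.10)], in changing two taxes, we should decrease the one for which

$$1 - \frac{\theta}{\Delta}\,\varepsilon_{i0} \qquad (8.11)$$

is smaller, or $\varepsilon_{i0}$ is larger; that is, we should increase the tax on the relative
complement. This extends in an obvious way if we choose pairs of taxes successively.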
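The pairwise rule can be illustrated with a short Python sketch. It assumes that the elasticity form of (8.9) reads $\sum_i x_i[1 - (\theta/\Delta)\varepsilon_{i0}]\,dt_i = 0$ and that only two taxes, i and j, are changed; the function name pair_reform and all numbers are hypothetical. Given a small change in $t_i$, it computes the offsetting change in $t_j$ that keeps revenue constant and the resulting utility change per unit of $\lambda$ from (8.10).

```python
def pair_reform(x_i, x_j, eps_i0, eps_j0, theta, delta, dt_i):
    """Revenue-neutral change in two taxes under uniform taxation t = theta*p.

    eps_i0, eps_j0: compensated elasticities of goods i and j with respect
    to the price of the untaxed good zero.
    delta: the term 1 - t.dx/dy, assumed positive.
    Returns (dt_j, dU / lambda).
    """
    # Revenue weights from the elasticity form of (8.9).
    r_i = x_i * (1.0 - theta / delta * eps_i0)
    r_j = x_j * (1.0 - theta / delta * eps_j0)
    dt_j = -r_i * dt_i / r_j                     # keeps r_i*dt_i + r_j*dt_j = 0
    du_over_lambda = -(x_i * dt_i + x_j * dt_j)  # equation (8.10) divided by lambda
    return dt_j, du_over_lambda


# Good i is the closer substitute for good zero (larger elasticity),
# so cutting its tax and raising the other should raise utility.
dt_j, du = pair_reform(x_i=2.0, x_j=3.0, eps_i0=0.5, eps_j0=0.1,
                       theta=0.2, delta=0.9, dt_i=-0.01)
print(dt_j, du)
```

Swapping the two elasticities reverses the sign of the utility change, which is the sense in which the tax on the relative complement should be the one that rises.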